Heterologous biosynthesis of nodulisporic acid

ABSTRACT

Nodulisporic acids (NAs) comprise a group of indole diterpenes known for their potent insecticidal activities; however, biosynthesis of NAs by their natural producer, Hypoxylon pulicicidum (Nodulisporium sp.), is exceptionally difficult to achieve. The identification of genes responsible for NA production could enable biosynthetic pathway optimization to provide access to NAs for commercial applications. Obtaining useful quantities of NAs using published fermentation methods is challenging, making gene knockout studies an undesirable method to confirm gene function. Alternatively, heterologous expression of H. pulicicidum genes in a more robust host species like Penicillium paxilli provides a way to rapidly identify the function of genes that play a role in NA biosynthesis. In this work, we identified the function of four secondary-metabolic genes necessary for the biosynthesis of nodulisporic acid F (NAF) and reconstituted these genes in the genome of P. paxilli to enable heterologous production of NAF in this fungus.

RELATED APPLICATIONS

The present patent application is a divisional of U.S. patent application Ser. No. 16/651,065 filed Mar. 26, 2020, which is a 35 U.S.C. 371 U.S. National Phase application of International Patent Application No. PCT/IB2018/057528, which was filed Sep. 28, 2018, claiming the benefit of priority to Australian Patent Application No. 2017903956 filed on Sep. 29, 2017. The entire text of the aforementioned applications is incorporated herein by reference in its entirety.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (sequencelisting.xml; Size: 118,571 bytes; and Date of Creation: Aug. 1, 2022) are herein incorporated by reference in their entirety.

FIELD OF THE INVENTION

This invention generally relates to novel polypeptides that catalyze at least one biochemical reaction leading to the production of a nodulisporic acid (NA), polynucleotides encoding such polypeptides, methods of making such polypeptides and polynucleotides, and methods of using such polypeptides and polynucleotides to produce at least one NA by heterologous expression in a permissive host.

BACKGROUND

Filamentous fungi produce a diverse repertoire of interesting and useful chemical compounds. Members of one such class of compounds, the indole diterpenes (IDTs), are of particular interest due to their wide range of chemical diversity and concomitant bioactivities, which include anti-MRSA,¹ anti-cancer,^(2,3) anti-H1N1,⁴ insecticidal⁵ and tremorgenic⁶ activities. NAs (FIG. 1) are a group of notably bioactive quasi-paspaline-like IDTs produced by Hypoxylon pulicicidum, formerly classified as Nodulisporium sp.⁷ Nodulisporic acid A (NAA) 10 is of particular significance because it exhibits highly potent insecticidal activity against blood-feeding arthropods while exhibiting no observable adverse effects on mammals.^(5,8)

NAs are especially difficult to biosynthesize from the natural producer, H. pulicicidum. Reported NA biosynthesis methods require that H. pulicicidum be grown for 21 days in complete darkness in highly nutrient-rich media.⁹ Due to the difficulty of NAA 10 biosynthesis in H. pulicicidum, obtaining useful quantities of NAA 10 using published fermentation methods is challenging, and production of commercial quantities of NAA 10 is essentially unachievable. Accordingly, attempts have been made to chemically synthesize NAA 10, resulting in mechanisms for the synthesis of nodulisporic acid F (NAF) 5a¹⁰ and nodulisporic acid D 7a,¹¹ but full synthesis of NAA 10 has not been achieved.¹² Consequently, there is a need in the art for new methods of NAA 10 synthesis and/or biosynthesis that will provide useful quantities of NAA 10.

It is an object of the present invention to provide a polynucleotide encoding at least one enzyme in the NAA 10 biosynthesis pathway of H. pulicicidum and/or to provide a method of using such a vector to produce at least one indole diterpene compound that is a NA and/or to produce a precursor to NAA 10 in a heterologous host and/or to at least provide the public with a useful choice.

In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.

SUMMARY OF THE INVENTION

In one aspect the invention relates to an isolated polypeptide comprising an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In another aspect the invention relates to an isolated polynucleotide encoding a polypeptide comprising an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In another aspect the invention relates to an isolated polynucleotide comprising at least 70% nucleic acid sequence identity to a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).

In another aspect the invention relates to a transcription unit (TU) comprising at least one isolated polynucleotide according to the invention.

In another aspect the invention relates to a vector that encodes an isolated polypeptide according to the invention.

In another aspect the invention relates to a vector comprising an isolated nucleic acid sequence or a TU according to the invention.

In another aspect the invention relates to an isolated host cell comprising an isolated polypeptide, isolated polynucleotide, TU and/or vector according to the invention.

In another aspect the invention relates to a method of making at least one NA comprising heterologously expressing at least one polypeptide, isolated nucleic acid sequence, TU or vector according to the invention in an isolated host cell.

In another aspect the invention relates to at least one NA made by a method of the invention.

In another aspect the present invention relates to an isolated polypeptide or functional fragment or variant thereof from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from 3-geranylgeranyl indole (GGI) 2 to NAA 10.

In another aspect the present invention relates to an isolated polynucleotide encoding at least one polypeptide or functional variant or fragment thereof from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.

In another aspect the invention relates to a method of making at least one Hypoxylon spp. polypeptide or functional variant or fragment thereof comprising heterologously expressing an isolated nucleic acid sequence or vector according to the invention in an isolated host cell.

In another aspect the invention relates to a method of making at least one NA comprising heterologously expressing in an isolated host cell, at least one polypeptide that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.

In another aspect the invention relates to an isolated host cell that expresses at least one heterologous polypeptide that catalyzes the transformation of a substrate in the biosynthetic pathway leading from GGI 2 to the formation of NAA 10.

In another aspect the invention relates to an isolated host cell that produces by heterologous expression, at least one polypeptide involved in the biosynthetic pathway leading from GGI 2 to NAA 10.

In another aspect the invention relates to a method of producing at least one NA comprising contacting a carbohydrate comprising substrate with a recombinant cell transformed with a nucleic acid that results in an increased level of activity of a polypeptide selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof compared to the cell prior to transformation, such that the substrate is metabolized to at least one NA.

In another aspect the invention relates to an isolated strain of Hypoxylon pulicicidum that comprises at least one heterologous nucleic acid sequence encoding an enzyme in a biosynthetic pathway leading to NAA 10.

In another aspect the invention relates to an isolated strain of Hypoxylon pulicicidum that expresses at least two different GGPPS enzymes.

In another aspect the invention relates to an isolated strain of Hypoxylon pulicicidum that comprises a genetic modification that leads to an increased biosynthesis of NAA 10.

In another aspect the invention relates to a method of making NAA 10 comprising expressing at least one heterologous nucleic acid sequence in Hypoxylon pulicicidum, wherein the at least one heterologous nucleic acid sequence encodes an enzyme in a biosynthetic pathway leading to NAA 10.

Various embodiments of the different aspects of the invention as discussed above are also set out below in the detailed description of the invention, but the invention is not limited thereto.

Other aspects of the invention may become apparent from the following description which is given by way of example only and with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described with reference to the figures in the accompanying drawings.

FIG. 1: Collection of known nodulisporic acids (NAs).

FIG. 2: Branch points in the biosynthetic pathway of indole diterpenes (IDTs) that give rise to the diverse array of IDT structures. Arrows represent enzymatic steps in IDT biosynthesis.

FIG. 3: HPLC analysis (271 nm) of extracts of P. paxilli knockout (KO) strains (in gray (—)) expressing different H. pulicicidum (Nod) enzymes and/or P. paxilli (Pax) enzymes (in black (—)). A black X covers the enzyme(s) that are not expressed in the P. paxilli KO strains (traces i.a, ii.a, iii.a, iv.a, and v.a). The enzyme(s) that have been newly expressed in the P. paxilli KO strain are depicted below the corresponding KO strain and next to their UV traces (i.b, ii.b, iv.b, and v.b). Notably there is a compound that elutes at the same retention time as emindole SB 4a, but emindole SB 4a is only present in three traces (ii.b, iii.b, and v.b) as confirmed by corresponding 406.31±0.01 m/z EICs (FIGS. 5, 6, and 9). Traces correspond to fungal extracts as follows: i.a=PN2290, i.b=pKV27:PN2290, ii.a=PN2257, ii.b=pKV63:PN2257, iii.a=PN2250, iii.b=pSK66:PN2250, iv.a=PN2250, iv.b=pKV74:PN2250, v.a=PN2257, v.b=pKV64:PN2257.

FIG. 4: Extracted ion chromatogram for pKV27:PN2290 (nodC:ΔpaxC) showing MS peak for paxilline 6b (5.3 min, 436.248±0.01 m/z).

FIG. 5: Extracted ion chromatograms for pKV63:PN2257 (nodM:ΔpaxM) showing MS peak for emindole SB 4a (19.3 min, 406.31±0.01 m/z) but not paspaline 4b (17.6 min, 422.305±0.01 m/z).

FIG. 6: Extracted ion chromatograms for pSK66:PN2250 (paxG, nodC, nodM, and nodB:ΔPAX cluster) showing MS peak for emindole SB 4a (19.3 min, 406.31±0.01 m/z) but not paspaline 4b (17.6 min, 422.305±0.01 m/z).

FIG. 7: Extracted ion chromatogram for pKV74:PN2250 (paxG, paxC, paxM, nodB:ΔPAX cluster) showing MS peak for paspaline 4b (17.6 min, 422.305±0.01 m/z) but not emindole SB 4a (19.3 min, 406.31±0.01 m/z).

FIG. 8: Depiction of the predicted NA gene cluster from H. pulicicidum (A) and the NAF 5a biosynthetic pathway (B). Arrows represent individual genes and arrow decorations represent gene function. Figure is not to exact scale and does not include exon/intron structure. Notably the gene cluster lacks a GGPPS responsible for the first secondary-metabolic step in IDT synthesis.

FIG. 9: Extracted ion chromatograms for pKV64:PN2257 (nodM and nodW:ΔpaxM) showing MS peaks for emindole SB 4a (19.3 min, 406.31±0.01 m/z) and NAF 5a (6.2 min, 436.284±0.01 m/z).

FIG. 10: Extracted ion chromatograms for pSK68:PN2250 (paxG, nodC, nodM, nodB, and nodW:ΔPAX cluster) showing MS peaks for emindole SB 4a (19.3 min, 406.31±0.01 m/z) and NAF 5a (6.2 min, 436.284±0.01 m/z).

FIG. 11: Overview of MIDAS Level-1 cloning.

FIG. 12: MIDAS module address system.

FIG. 13: Overview of MIDAS cassettes.

FIG. 14: Principle of MIDAS multigene assembly (level-3).

FIG. 15: Overview of MIDAS format.
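
By way of illustration only, the extracted ion chromatograms (EICs) referred to in FIGS. 4 to 10 can be understood as the signal that remains after mass spectrometry scans are filtered to a narrow m/z window, for example 436.284±0.01 m/z for NAF 5a. The following Python sketch shows that filtering step under stated assumptions: the scan layout, the function names `extracted_ion_chromatogram` and `apex`, and the toy intensity values are hypothetical and are not taken from the instrument software or data used to prepare the figures.

```python
from typing import List, Tuple

# Hypothetical scan layout: (retention time in minutes, [(m/z, intensity), ...]).
Scan = Tuple[float, List[Tuple[float, float]]]


def extracted_ion_chromatogram(
    scans: List[Scan], target_mz: float, tol: float = 0.01
) -> List[Tuple[float, float]]:
    """Return (retention time, summed intensity) for peaks within target_mz ± tol."""
    eic = []
    for rt, peaks in scans:
        total = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        eic.append((rt, total))
    return eic


def apex(eic: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the (retention time, intensity) point with the highest intensity."""
    return max(eic, key=lambda point: point[1])


# Toy data (invented for illustration, not measured values).
scans: List[Scan] = [
    (6.1, [(436.284, 120.0), (406.310, 5.0)]),
    (6.2, [(436.286, 900.0), (422.305, 3.0)]),
    (19.3, [(406.312, 700.0), (436.300, 50.0)]),
]

naf_eic = extracted_ion_chromatogram(scans, 436.284)       # window used for NAF 5a
emindole_eic = extracted_ion_chromatogram(scans, 406.31)   # window used for emindole SB 4a

print(apex(naf_eic))       # (6.2, 900.0)
print(apex(emindole_eic))  # (19.3, 700.0)
```

In this sketch the peak at 436.300 m/z falls outside the ±0.01 window and therefore does not contribute to the NAF 5a trace, which is the point of reporting a tolerance alongside each m/z value.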

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The term “comprising” as used in this specification and claims means “consisting at least in part of”; that is to say when interpreting statements in this specification and claims which include “comprising”, the features prefaced by this term in each statement all need to be present but other features can also be present. Related terms such as “comprise” and “comprised” are to be interpreted in similar manner.

The term “consisting essentially of” as used herein means the specified materials or steps and those that do not materially affect the basic and novel characteristic(s) of the claimed invention.

The term “consisting of” as used herein means the specified materials or steps of the claimed invention, excluding any element, step, or ingredient not specified in the claim.

The terms “recognition site” and “restriction site” are used interchangeably herein and mean the same thing. These terms as used herein with reference to a restriction enzyme mean the nucleic acid sequence or sequences of a polynucleotide that define the binding site on the polynucleotide for a given restriction enzyme.

The term “indole diterpene (IDT) compound” or “indole diterpenoid” refers to any compound derived from an indole containing precursor, preferably indole-3-glycerol phosphate 1b, and geranylgeranyl pyrophosphate (GGPP) 1a.

In some embodiments an IDT compound is selected from the group consisting of GGI 2, emindole SB 4a, and NAF 5a.

The term “genetic construct” refers to a polynucleotide molecule, usually double-stranded DNA, which has been conjugated to another polynucleotide molecule. In one non-limiting example a genetic construct is made by inserting a first polynucleotide molecule into a second polynucleotide molecule, for example by restriction/ligation as known in the art. In some embodiments, a genetic construct comprises a single polynucleotide module, at least two polynucleotide modules, or a series of multiple polynucleotide modules assembled into a single contiguous polynucleotide molecule (also referred to herein as a “multigene construct”), but not limited thereto.

A genetic construct may contain the necessary elements that permit transcription of a polynucleotide molecule, and, optionally, for translating the transcript into a polypeptide. A polynucleotide molecule comprised in and/or by the gene construct may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The genetic construct may be linked to a vector.

The term “transcription unit” (TU) as used herein refers to a polynucleotide comprising a sequence of nucleotides that code for a single RNA molecule including all the nucleotide sequences necessary for transcription of the single RNA molecule, including a promoter, an RNA-coding sequence, and a terminator, but not limited thereto.

The term “transcription unit module” (TUM) as used herein refers to a polynucleotide comprising a sequence of nucleotides that encode a single RNA molecule, or parts thereof; or that encode a protein coding sequence (CDS), or parts thereof; or that encode sequence elements, or parts thereof, that control transcription of that RNA molecule; or that encode sequence elements or parts thereof that control translation of the CDS. Such sequence elements may include, but are not limited to, promoters, untranslated regions (UTRs), terminators, polyadenylation signals, ribosome binding sites, transcriptional enhancers and translational enhancers.

The term “multigene construct” as used herein means a genetic construct that is a polynucleotide comprising at least two TUs.
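
As a non-limiting illustration, the relationship between a TU and a multigene construct as defined above can be sketched as a small data model. The Python sketch below is an assumption made for clarity only: the class names `TranscriptionUnit` and `GeneticConstruct`, their fields, and the placeholder sequences are hypothetical and do not correspond to any sequence or software described in this specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TranscriptionUnit:
    """A TU modelled as promoter + RNA-coding sequence + terminator (hypothetical model)."""
    promoter: str
    rna_coding_sequence: str
    terminator: str

    def sequence(self) -> str:
        # Concatenate the parts in 5' to 3' order on the sense strand.
        return self.promoter + self.rna_coding_sequence + self.terminator


@dataclass
class GeneticConstruct:
    """A contiguous polynucleotide assembled from zero or more TUs (hypothetical model)."""
    units: List[TranscriptionUnit] = field(default_factory=list)

    def is_multigene(self) -> bool:
        # A multigene construct comprises at least two TUs.
        return len(self.units) >= 2

    def sequence(self) -> str:
        return "".join(unit.sequence() for unit in self.units)


# Placeholder sequences, invented for illustration only.
tu_a = TranscriptionUnit("TATAAA", "ATGGCTTGA", "AATAAA")
tu_b = TranscriptionUnit("TATAAA", "ATGAAATAA", "AATAAA")

single = GeneticConstruct([tu_a])
multi = GeneticConstruct([tu_a, tu_b])

print(single.is_multigene())  # False
print(multi.is_multigene())   # True
```

The sketch deliberately ignores genetic elements that are not part of a TU, such as origins of replication, which a real construct would normally also carry.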

The term “marker” as used herein means a nucleic acid sequence in a polynucleotide that encodes a selectable marker or scorable marker.

The term “selectable marker” as used herein refers to a TU, which when introduced into a cell, confers at least one trait on the cell that allows the cell to be selected based on the presence or absence of that trait. In one embodiment the cell is selected based on survival under conditions that kill cells not comprising the at least one selectable marker.

The term “scorable marker” as used herein refers to a TU, which when introduced into a cell, confers at least one trait on the cell that allows the cell to be scored based on the presence or absence of that trait. In one embodiment the cell comprising the TU is scored by identifying the cell phenotypically from a plurality of cells.

The term “genetic element” as used herein refers to any polynucleotide sequence that is not a TU or does not form part of a TU. Such polynucleotide sequences may include, but are not limited to origins of replication for plasmids and viruses, centromeres, telomeres, repeat sequences, sequences used for homologous recombination, site-specific recombination sequences, and sequences controlling DNA transfer between organisms.

The term “vector” as used herein refers to any type of polynucleotide molecule that may be used to manipulate genetic material so that it can be amplified, replicated, manipulated, partially replicated, modified and/or expressed, but not limited thereto. In some embodiments a vector may be used to transport a polynucleotide comprised in that vector into a cell or organism.

The term “source vector” as used herein refers to a vector into which polynucleotide sequences of interest can be cloned. In some embodiments the polynucleotide sequences are TUs and TUMs as described herein. In some embodiments a source vector is selected from the group consisting of plasmids, bacterial artificial chromosomes (BACs), phage artificial chromosomes (PACs), yeast artificial chromosomes (YACs), bacteriophage, phagemids, and cosmids. In some embodiments, a source vector comprising a polynucleotide sequence of interest is termed an entry clone. In some embodiments the entry clone can serve as a shuttle or destination vector for receiving further polynucleotide sequences.

The term “shuttle vector” as used herein refers to a vector into which polynucleotide sequences of interest can be cloned and from which they can be manipulated. In some embodiments the polynucleotide sequences are TUs and TUMs as described herein. In some embodiments a shuttle vector is selected from the group consisting of plasmids, BACs, PACs, YACs, bacteriophage, phagemids, and cosmids. In some embodiments, a shuttle vector comprising a polynucleotide sequence of interest can serve as a destination vector for receiving further polynucleotide sequences.

The term “destination vector” as used herein refers to a vector into which polynucleotide sequences of interest can be cloned. In some embodiments the polynucleotide sequences are TUs and TUMs as described herein. In some embodiments a destination vector is selected from the group consisting of plasmids, BACs, PACs, YACs, bacteriophage, phagemids, and cosmids. In some embodiments, a destination vector comprising a polynucleotide sequence of interest is an entry clone. In some embodiments the entry clone can serve as a destination vector for receiving further polynucleotide sequences.

The term “polynucleotide(s),” as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides. Reference to nucleic acids, nucleic acid molecules, nucleotide sequences and polynucleotide sequences is to be similarly understood.

The term “gene” as used herein refers to the biologic unit of heredity, self-reproducing and located at a definite position (locus) on a particular chromosome. In one embodiment the particular chromosome is a eukaryotic or bacterial chromosome. The term bacterial chromosome is used interchangeably herein with the term bacterial genome.

The term “gene cluster” as used herein refers to a group of genes located closely together on the same chromosome whose products play a coordinated role in a specific aspect of cellular primary or secondary metabolism. In one example a gene cluster comprises a group of CDSs the products of which all participate in a series of biochemical reactions that comprise the biosynthetic pathway or array that produces a given metabolite, particularly a secondary metabolite.

The term “secondary metabolite” as used herein refers to compounds that are not involved in primary metabolism, and therefore differ from the more prevalent macromolecules such as proteins and nucleic acids that make up the basic machinery of life.

The terms “under conditions wherein the . . . enzyme is active” and “under conditions wherein the . . . enzymes are active”, and grammatical variations thereof when used in reference to enzyme activity mean that the enzyme will perform its expected function; e.g., a restriction endonuclease will cleave a nucleic acid at an appropriate restriction site, and a DNA ligase will covalently join two polynucleotides together.

The term “endogenous” as used herein refers to a constituent of a cell, tissue or organism that originates or is produced naturally within that cell, tissue or organism. An “endogenous” constituent may be any constituent including but not limited to a polynucleotide, a polypeptide including a non-ribosomal polypeptide, a fatty acid or a polyketide, but not limited thereto.

The term “exogenous” as used herein refers to any constituent of a cell, tissue or organism that does not originate or is not produced naturally within that cell, tissue or organism. An exogenous constituent may be, for example, a polynucleotide sequence that has been introduced into a cell, tissue or organism, or a polypeptide expressed in that cell, tissue or organism from that polynucleotide sequence.

“Naturally occurring” as used herein with reference to a polynucleotide sequence according to the invention refers to a primary polynucleotide sequence that is found in nature. A synthetic polynucleotide sequence that is identical to a wild polynucleotide sequence is, for the purposes of this disclosure, considered a naturally occurring sequence. What is important for a naturally occurring polynucleotide sequence is that the actual sequence of nucleotide bases that comprise the polynucleotide is found or known from nature.

For example, a wild type polynucleotide sequence is a naturally occurring polynucleotide sequence, but not limited thereto. A naturally occurring polynucleotide sequence also refers to variant polynucleotide sequences as found in nature that differ from wild type. Examples include allelic variants and naturally occurring recombinant polynucleotide sequences due to hybridization or horizontal gene transfer, but not limited thereto.

“Non-naturally occurring” as used herein with reference to a polynucleotide sequence according to the invention refers to a polynucleotide sequence that is not found in nature. Examples of non-naturally occurring polynucleotide sequences include artificially produced mutant and variant polynucleotide sequences, made for example by point mutation, insertion, or deletion, but not limited thereto. Non-naturally occurring polynucleotide sequences also include chemically evolved sequences. What is important for a non-naturally occurring polynucleotide sequence according to the invention is that the actual sequence of nucleotide bases that comprise the polynucleotide is not found or known from nature.

The term “wild type” when used herein with reference to a polynucleotide refers to a naturally occurring, non-mutant form of a polynucleotide. A mutant polynucleotide means a polynucleotide that has sustained a mutation as known in the art, such as point mutation, insertion, deletion, substitution, amplification or translocation, but not limited thereto.

The term “wild type” when used herein with reference to a polypeptide refers to a naturally occurring, non-mutant form of a polypeptide. A wild type polypeptide is a polypeptide that is capable of being expressed from a wild type polynucleotide.

The term “coding sequence” or “open reading frame” (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The CDS is identified by the presence of a 5′ translation start codon and a 3′ translation stop codon. When inserted into a genetic construct or an expression cassette, a “coding sequence” (CDS) is capable of being expressed when it is operably linked to a promoter sequence and/or other regulatory elements.
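
By way of example only, the identification of a CDS by a 5′ translation start codon and a 3′ translation stop codon can be sketched as a simple scan of the sense strand. The Python sketch below rests on several assumptions that are not part of this specification: the input is an intron-free DNA sequence (for example a cDNA), the start codon is ATG, the stop codons are those of the standard genetic code, and the function name `find_first_orf` is hypothetical.

```python
from typing import Optional

# Stop codons of the standard genetic code, written as DNA (assumption: standard code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(sense_strand: str) -> Optional[str]:
    """Return the first ATG...stop open reading frame on the sense strand, or None.

    Assumes an intron-free DNA sequence such as a cDNA; genomic sequences with
    introns would need to be spliced before scanning.
    """
    seq = sense_strand.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with the candidate start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return seq[start:pos + 3]
        # No in-frame stop codon found; try the next ATG.
        start = seq.find("ATG", start + 1)
    return None


print(find_first_orf("ccATGGCTTGAaa"))  # ATGGCTTGA
print(find_first_orf("ATGAAA"))         # None
```

The scan returns the first start codon that has an in-frame stop codon rather than the longest ORF, which is a simplification; gene prediction in practice also weighs ORF length, codon usage and flanking regulatory elements.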

“Operably-linked” means that the sequence to be expressed is placed under the control of regulatory elements.

“Regulatory elements” as used herein refers to any nucleic acid sequence element that controls or influences the expression of a polynucleotide insert from a vector, genetic construct or expression cassette and includes promoters, transcription control sequences, translation control sequences, origins of replication, tissue-specific regulatory elements, temporal regulatory elements, enhancers, polyadenylation signals, repressors and terminators. Regulatory elements can be “homologous” or “heterologous” to the polynucleotide insert to be expressed from a genetic construct, expression cassette or vector as described herein. When a genetic construct, expression cassette or vector as described herein is present in a cell, a regulatory element can be “endogenous”, “exogenous”, “naturally occurring” and/or “non-naturally occurring” with respect to the cell.

The term “noncoding region” refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5′ UTR and the 3′ UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.

Terminators are sequences, which terminate transcription, and are found in the 3′ untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.

The term “promoter” refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate the transcription of a polynucleotide sequence. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes. In one non-limiting example, bacterial promoters may comprise a “Pribnow box” (also known as the −10 region), and other motifs that are bound by transcription factors and promote transcription. Promoters can be homologous or heterologous with respect to the polynucleotide sequence to be expressed. When the polynucleotide sequence is to be expressed in a cell, a promoter may be an endogenous or exogenous promoter. Promoters can be constitutive promoters, inducible promoters or regulatable promoters as known in the art.

“Homologous” as used herein with reference to polynucleotide regulatory elements, means a polynucleotide regulatory element that is a native and naturally-occurring polynucleotide regulatory element. A homologous polynucleotide regulatory element may be operably linked to a polynucleotide of interest such that the polynucleotide of interest can be expressed from a TU, genetic element or vector according to the invention.

“Homologous” as used herein with reference to polynucleotide or polypeptide in a host organism means that the polynucleotide or polypeptide is a native and naturally-occurring polynucleotide or polypeptide within that host organism. A homologous polynucleotide may be operably linked to a homologous or heterologous regulatory element so that a homologous polypeptide may be expressed from a TU, genetic element or vector comprising the homologous polynucleotide as described herein.

“Introduced Homologous” as used herein with reference to polynucleotide or polypeptide in a host organism means that the polynucleotide or polypeptide is a native and naturally-occurring polynucleotide or polypeptide within that host organism that has been introduced into the organism by experimental techniques. An introduced homologous polynucleotide may be operably linked to a homologous or heterologous regulatory element so that a homologous polypeptide may be expressed from a TU, genetic element or vector comprising the homologous polynucleotide as described herein.

“Heterologous” as used herein with reference to polynucleotideregulatory elements, means a polynucleotide regulatory element that isnot a native and naturally-occurring polynucleotide regulatory element.A heterologous polynucleotide regulatory element is not normallyassociated with the CDS to which it is operably linked. A heterologousregulatory element may be operably linked to a polynucleotide ofinterest such that the polynucleotide of interest can be expressed froma, vector, genetic construct or expression cassette according to theinvention. Such promoters may include promoters normally associated withother genes, ORFs or coding regions, and/or promoters isolated from anyother bacterial, viral, eukaryotic, or mammalian cell.

“Heterologous” as used herein with reference to a polynucleotide or polypeptide in a host organism means a polynucleotide or polypeptide that is not a native and naturally-occurring polynucleotide or polypeptide in that host organism. A heterologous polynucleotide may be operably linked to a heterologous or homologous regulatory element so that a heterologous polypeptide may be expressed from a TU, genetic element or vector comprising the heterologous polynucleotide as described herein.

The terms “heterologously expressing” and “heterologous expression” mean the expression of a heterologous polypeptide in a host cell.

A “biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10” means one of the specific reactions catalyzed by one of the specific enzymes involved in transforming the substrate molecule GGI 2 through the following intermediates: mono-epoxidized GGI 3a, emindole SB 4a, NAF 5a, NAE 6a, NAD 7a, NAC 8, NAB 9, to NAA 10, and does not include similar enzymes within a host cell that may have similar functions but that do not act on the particular named intermediates above.

A “functional variant or fragment thereof” of a polypeptide is a subsequence of the polypeptide that performs a function that is required for the biological activity or binding of that polypeptide and/or provides the three dimensional structure of the polypeptide. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or functional polypeptide derivative thereof that is capable of performing the polypeptide activity.

“Isolated” as used herein with reference to polynucleotide or polypeptide sequences describes a sequence that has been removed from its natural cellular environment. An isolated molecule may be obtained by any method or combination of methods as known and used in the art, including biochemical, recombinant, and synthetic techniques. The polynucleotide or polypeptide sequences may be prepared by at least one purification step.

“Isolated” when used herein in reference to a cell or host cell describes a cell or host cell that has been obtained or removed from an organism or from its natural environment and is subsequently maintained in a laboratory environment as known in the art. The term encompasses single cells, per se, as well as cells or host cells comprised in a cell culture and can include a single cell or single host cell.

The term “isolated host cell” as used herein with reference to a fungal host cell encompasses single cells of unicellular fungi and the hyphae and mycelia of filamentous fungi including septate and non-septate forms.

The term “recombinant” refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/or is recombined with sequences that are not present in its natural context. A “recombinant” polypeptide sequence is produced by translation from a “recombinant” polynucleotide sequence.

As used herein, the term “variant” refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the polypeptides useful in the invention have biological activities that are the same or similar to those of a corresponding wild type molecule; i.e., the parent polypeptides or polynucleotides.

In certain embodiments, variants of the polypeptides described herein have biological activities that are similar, or that are substantially similar to their corresponding wild type molecules. In certain embodiments the similarities are similar activity and/or binding specificity.

In certain embodiments, variants of polypeptides described herein have biological activities that differ from their corresponding wild type molecules. In certain embodiments the differences are altered activity and/or binding specificity.

The term “variant” with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.

Variant polynucleotide sequences preferably exhibit at least 50%, at least 60%, preferably at least 70%, preferably at least 71%, preferably at least 72%, preferably at least 73%, preferably at least 74%, preferably at least 75%, preferably at least 76%, preferably at least 77%, preferably at least 78%, preferably at least 79%, preferably at least 80%, preferably at least 81%, preferably at least 82%, preferably at least 83%, preferably at least 84%, preferably at least 85%, preferably at least 86%, preferably at least 87%, preferably at least 88%, preferably at least 89%, preferably at least 90%, preferably at least 91%, preferably at least 92%, preferably at least 93%, preferably at least 94%, preferably at least 95%, preferably at least 96%, preferably at least 97%, preferably at least 98%, and preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 8 nucleotide positions, preferably at least 10 nucleotide positions, preferably at least 15 nucleotide positions, preferably at least 20 nucleotide positions, preferably at least 27 nucleotide positions, preferably at least 40 nucleotide positions, preferably at least 50 nucleotide positions, preferably at least 60 nucleotide positions, preferably at least 70 nucleotide positions, preferably at least 80 nucleotide positions, preferably over the entire length of a polynucleotide used in or identified according to a method of the invention.

Polynucleotide variants also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance.

Polynucleotide sequence identity and similarity can be determined readily by those of skill in the art.
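By way of illustration only, percent identity over a comparison window can be sketched as follows. The function names (percent_identity, best_window_identity) and the convention of counting gap characters as mismatches are assumptions made for this sketch and are not taken from the specification; in practice a skilled worker would normally use an established alignment program.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: a gap character ('-') never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def best_window_identity(seq_a: str, seq_b: str, window: int) -> float:
    """Highest percent identity over any pair of ungapped windows of length `window`."""
    if window < 1 or window > min(len(seq_a), len(seq_b)):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = percent_identity(seq_a[i:i + window], seq_b[j:j + window])
            best = max(best, score)
    return best


if __name__ == "__main__":
    # Two toy nucleotide sequences that differ at one of eight positions.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    # An 8-position comparison window, the smallest window named above.
    print(best_window_identity("ACGTACGTAA", "TTACGTACGT", 8))
```

The exhaustive window search is quadratic in sequence length and is intended only to make the definition concrete for short sequences.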

Variant polynucleotides also encompass polynucleotides that differ from the polynucleotide sequences described herein but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a “silent variation”. Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
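The notion of a silent variation can be illustrated with a short sketch that translates two coding sequences using the standard genetic code and checks whether they differ at the nucleotide level while encoding the same amino acid sequence. The helper names (translate, is_silent_variation, synonymous_codons) are hypothetical, and the sketch assumes the standard nuclear code rather than any host-specific table.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent_variation(cds_a: str, cds_b: str) -> bool:
    """True if the sequences differ but encode identical amino acid sequences."""
    return cds_a.upper() != cds_b.upper() and translate(cds_a) == translate(cds_b)


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid (or stop) as `codon`."""
    target = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so this change is silent.
    print(is_silent_variation("ATGGCTTGGTAA", "ATGGCCTGGTAA"))
    # ATG and TGG have no synonymous alternatives, as noted above.
    print(synonymous_codons("ATG"), synonymous_codons("TGG"))
```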

Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).

The term “variant” with reference to polypeptides also encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 35%, preferably at least 40%, preferably at least 50%, preferably at least 60%, preferably at least 70%, preferably at least 71%, preferably at least 72%, preferably at least 73%, preferably at least 74%, preferably at least 75%, preferably at least 76%, preferably at least 77%, preferably at least 78%, preferably at least 79%, preferably at least 80%, preferably at least 81%, preferably at least 82%, preferably at least 83%, preferably at least 84%, preferably at least 85%, preferably at least 86%, preferably at least 87%, preferably at least 88%, preferably at least 89%, preferably at least 90%, preferably at least 91%, preferably at least 92%, preferably at least 93%, preferably at least 94%, preferably at least 95%, preferably at least 96%, preferably at least 97%, preferably at least 98%, and preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 2 amino acid positions, preferably at least 3 amino acid positions, preferably at least 4 amino acid positions, preferably at least 5 amino acid positions, preferably at least 7 amino acid positions, preferably at least 10 amino acid positions, preferably at least 15 amino acid positions, preferably at least 20 amino acid positions, preferably over the entire length of a polypeptide used in or identified according to a method of the invention.

Polypeptide variants also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance.

Polypeptide sequence identity and similarity can be determined readily by those of skill in the art.

A variant polypeptide includes a polypeptide wherein the amino acid sequence differs from a polypeptide herein by one or more conservative or non-conservative amino acid substitutions, deletions, additions or insertions which do not affect the biological activity of the peptide.

Conservative substitutions typically include the substitution of one amino acid for another with similar characteristics, e.g., substitutions within the following groups: valine, glycine; glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
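As a minimal sketch, the groups listed above can be encoded as sets of one-letter amino acid codes, with a substitution treated as conservative when both residues fall within at least one common group. The name is_conservative and the use of one-letter codes are choices made for this illustration; the groups themselves are transcribed from the preceding paragraph and are described there as typical rather than exhaustive.

```python
# Groups transcribed from the paragraph above, as one-letter codes.
CONSERVATIVE_GROUPS = [
    {"V", "G"},       # valine, glycine
    {"G", "A"},       # glycine, alanine
    {"V", "I", "L"},  # valine, isoleucine, leucine
    {"D", "E"},       # aspartic acid, glutamic acid
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"K", "R"},       # lysine, arginine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two distinct residues share at least one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution has taken place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # same group
    print(is_conservative("V", "A"))  # V-G and G-A are listed, V-A is not
```

Because membership is checked group by group, the relation is deliberately not transitive: valine to alanine is not treated as conservative merely because each is grouped with glycine.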

Analysis of evolved biological sequences has shown that not all sequence changes are equally likely, reflecting at least in part the differences in conservative versus non-conservative substitutions at a biological level. For example, certain amino acid substitutions may occur frequently, whereas others are very rare. Evolutionary changes or substitutions in amino acid residues can be modelled by a scoring matrix also referred to as a substitution matrix. Such matrices are used in bioinformatics analysis to identify relationships between sequences and are known to the skilled worker.

Other variants include peptides with modifications which influence peptide stability. Such analogs may contain, for example, one or more non-peptide bonds (which replace the peptide bonds) in the peptide sequence. Also included are analogs that include residues other than naturally occurring L-amino acids, e.g. D-amino acids or non-naturally occurring synthetic amino acids, e.g. beta or gamma amino acids and cyclic analogs.

Substitutions, deletions, additions or insertions may be made by mutagenesis methods known in the art. A skilled worker will be aware of methods for making phenotypically silent amino acid substitutions. See for example Bowie et al., 1990, Science 247, 1306.

A polypeptide as used herein can also refer to a polypeptide that has been modified during or after synthesis, for example, by biotinylation, benzylation, glycosylation, phosphorylation, amidation, by derivatization using blocking/protecting groups and the like. Such modifications may increase stability or activity of the polypeptide.

The terms “modulate(s) expression”, “modulated expression” and “modulating expression” of a polynucleotide or polypeptide, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide to be expressed according to the invention is modified thus leading to modulated expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The “modulated expression” can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in an increase or decrease in the activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.

The terms “modulate(s) activity”, “modulated activity” and “modulating activity” of a polynucleotide or polypeptide, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide to be expressed according to the invention is modified thus leading to modulated expression of a polynucleotide or modulated expression or activity of a polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The “modulated activity” can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in an increase or decrease in the functional activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.

It is intended that reference to a range of numbers disclosed herein (for example 1 to 10) also incorporates reference to all related numbers within that range (for example, 1, 1.1, 2, 3, 3.9, 4, 5, 6, 6.5, 7, 8, 9 and 10) and also any range of rational numbers within that range (for example 2 to 8, 1.5 to 5.5 and 3.1 to 4.7) and, therefore, all sub-ranges of all ranges expressly disclosed herein are expressly disclosed. These are only examples of what is specifically intended and all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application in a similar manner.

DETAILED DESCRIPTION

Since the identification of the biosynthetic pathway for the IDT paxilline 6b^(6,13-16) in Penicillium paxilli, gene functionality in seven other IDT biosynthetic pathways has been elucidated.¹⁷⁻²² These IDT pathways share homologous genes that encode enzymes for the first three steps in IDT biosynthesis (FIG. 2): (I) a geranylgeranyl pyrophosphate synthase (GGPPS) converts farnesyl pyrophosphate and isopentenyl pyrophosphate into GGPP 1a, (II) a geranylgeranyl transferase (GGT) catalyzes the indole condensation of GGPP 1a and indole-3-glycerol phosphate 1b to make GGI 2, and (III) a regioselective flavin adenine dinucleotide (FAD) dependent epoxidase creates the single and/or double epoxidized-GGI products 3a/3b. At the fourth enzymatic step, involving IDT cyclization, the pathways diverge into four key branches giving rise to mono/di-oxygenated anti-Markovnikov-derived cyclic cores like emindole SB 4a and paspaline 4b, or mono/di-oxygenated Markovnikov-derived cyclic cores like aflavinine 4c and the emindole DB 4d. These cyclic cores are often further modified with decorative enzymes that create the bioactive diversity seen across IDTs.

NAs are bioactive IDTs produced by Hypoxylon pulicicidum, with NAA 10 being of particular significance due to its highly potent insecticidal activity against blood-feeding arthropods and lack of mammalian toxicity.^(5,8) However, as described herein, the production of NAA 10 by direct synthesis has not been achieved, and the biosynthetic production of this compound in quantities that would be useful at even a small scale commercial level would be difficult to achieve, if at all.

Accordingly, the present invention generally relates to a series of isolated genes from the fungus H. pulicicidum, which, combined, form a gene cluster that mediates the production of NAs, and to the use of that gene cluster to direct the heterologous expression of NAs in an isolated host cell, preferably an isolated fungal cell. Using a recently developed technique for manipulating gene sequences termed the Modular Idempotent DNA Assembly System (MIDAS), the inventors have reconstituted the biosynthetic pathway for NAF 5a from H. pulicicidum in an alternate fungal host, Penicillium paxilli. The MIDAS platform and method of using the MIDAS platform are described herein and in related patent application AU2017903955, the entirety of which is incorporated herein by reference.

The inventors analyzed the genomic sequence of H. pulicicidum and identified a cluster comprising 15 predicted coding sequences (CDSs): nodW cDNA (SEQ ID NO:2) and genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5) and genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8) and genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11) and genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14) and genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17) and genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20) and genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23) and genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26) and genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29) and genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32) and genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35) and genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38) and genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49) and genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55) and genomic DNA (SEQ ID NO:54) that are expected to encode enzymes necessary for the biosynthesis of NAA 10.
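For convenience, the sequence identifiers recited above can be collected into a lookup table. The sketch below is illustrative only: the names SeqIds, GENOMIC_SEQ_ID and NOD_SEQ_IDS are hypothetical, and the table relies on the observation that, for each gene recited in this specification, the cDNA and polypeptide identifiers are the genomic identifier plus one and plus two respectively.

```python
from typing import NamedTuple


class SeqIds(NamedTuple):
    genomic: int  # SEQ ID NO of the genomic DNA
    cdna: int     # SEQ ID NO of the cDNA
    protein: int  # SEQ ID NO of the encoded polypeptide


# Genomic DNA SEQ ID NOs as recited in the paragraph above.
GENOMIC_SEQ_ID = {
    "nodW": 1, "nodR": 4, "nodX": 7, "nodM": 10, "nodB": 13,
    "nodO": 16, "nodJ": 19, "nodC": 22, "nodY1": 25, "nodD2": 28,
    "nodD1": 31, "nodY2": 34, "nodZ": 37, "nodS": 48, "nodI": 54,
}

# cDNA = genomic + 1 and polypeptide = genomic + 2 for every gene listed.
NOD_SEQ_IDS = {
    gene: SeqIds(genomic=g, cdna=g + 1, protein=g + 2)
    for gene, g in GENOMIC_SEQ_ID.items()
}

if __name__ == "__main__":
    for gene, ids in NOD_SEQ_IDS.items():
        print(f"{gene}: genomic SEQ ID NO:{ids.genomic}, "
              f"cDNA SEQ ID NO:{ids.cdna}, polypeptide SEQ ID NO:{ids.protein}")
```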

The boundaries of this cluster were determined by identifying flanking genes that have high similarity and syntenic organisation compared with an equivalent genomic locus in another Hypoxylon strain that does not produce nodulisporic acids. Details of these predicted genes in the cluster and their proposed function are shown in Table 1. Seven of the cluster genes are homologous to those found in other IDT biosynthetic gene clusters (Tables 2-6). The protein products of the seven predicted genes that are homologous to IDT biosynthesis genes from other fungi have at least 35% amino acid identity to their homologues in the PAX cluster of P. paxilli, the JAN cluster of P. janthinellum, and/or the PEN cluster of P. crustosum and include a GGT (NodC (SEQ ID NO:24)), two FAD-dependent oxidases (NodM (SEQ ID NO:12) and NodO (SEQ ID NO:18)), an IDT cyclase (NodB (SEQ ID NO:15)), two prenyl transferases (NodD2 (SEQ ID NO:30), and NodD1 (SEQ ID NO:33)), and one cytochrome P450 oxygenase (NodR (SEQ ID NO:6)). The other eight putative ORFs were predicted to encode four cytochrome P450 oxygenases (NodW (SEQ ID NO:3), NodX (SEQ ID NO:9), NodJ (SEQ ID NO:21), and NodZ (SEQ ID NO:39)), a pair of paralogous FAD-dependent oxygenases (NodY1 (SEQ ID NO:27), and NodY2 (SEQ ID NO:36)), and two gene products that may be involved in NA biosynthesis with unknown functions (NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56)). Similar to the TER gene cluster from Chaunopycnis alba (Tolypocladium album) responsible for terpendole biosynthesis,¹⁷ the NOD cluster does not appear to contain a secondary metabolite-specific GGPPS gene. Notably, the inventors identified only one GGPPS-encoding gene in the genome of H. pulicicidum and the amino acid sequence of its predicted protein product, its exon/intron structure, and its location outside of the identified cluster strongly suggest that it is responsible for primary metabolic function similar to ggs1 in P. paxilli.²³

To confirm the function of gene products and directly establish their respective roles in NAA 10 biosynthesis, the inventors constructed a series of plasmids harbouring various combinations of these genes, which they then transformed into appropriate P. paxilli hosts (Table 7) for heterologous production of NAA 10 precursors. Accordingly, CDSs of the H. pulicicidum genes of interest were amplified (see Table 8 for primers) and cloned into a MIDAS Level-1 destination vector, pML1 (Table 9). At MIDAS Level-2, the cloned CDSs were placed under the control of heterologous promoter (ProUTR) and transcriptional terminator (UTRterm) modules to generate full-length TUs (Table 10), which were then used to generate the multi-gene plasmids (Table 11). The inventors used a repertoire of P. paxilli knockout strains (Table 7) to carry out functional complementations and pathway reconstitution to determine the functions of genes in the NOD cluster. Following transformation of P. paxilli hosts with multi-gene plasmids, the inventors determined the chemical phenotypes of the transformants initially by normal-phase thin-layer chromatography (TLC, results not shown) and subsequently by reversed-phase liquid chromatography-mass spectrometry (LC-MS) analysis of fungal extracts. The inventors purified the newly expressed metabolites, as determined by high resolution mass spectrometry (HRMS), by semi-preparative reversed-phase high-performance liquid chromatography (HPLC) and subjected compounds to nuclear magnetic resonance (NMR) spectroscopic analysis (¹H, ¹³C, and HSQC, HMBC, COSY) for final identification.

Using this methodology, the inventors identified that nodC (cDNA (SEQ ID NO:23) and genomic DNA (SEQ ID NO:22)) is a functional ortholog of paxC (cDNA (SEQ ID NO:43) and genomic DNA (SEQ ID NO:42)), and that NodC (SEQ ID NO:24) mediates the production of GGI 2, the second step in IDT biosynthesis, in H. pulicicidum (FIG. 3, trace i.b, FIG. 4). NodC (SEQ ID NO:24) shares 52.3% amino acid sequence identity with PaxC (SEQ ID NO:44) from P. paxilli (Table 2). The inventors also identified that nodM (cDNA (SEQ ID NO:11) and genomic DNA (SEQ ID NO:10)) is a homolog of paxM (cDNA (SEQ ID NO:46) and genomic DNA (SEQ ID NO:45)), and that NodM (SEQ ID NO:12) is a GGI 2 mono-epoxidase that catalyzes the production of monoepoxidized-GGI 3a (FIG. 3, trace ii.b, FIG. 5). NodM (SEQ ID NO:12) shares 48.6% sequence identity with PaxM (SEQ ID NO:47) from P. paxilli (Table 3). Notably, the inventors showed that NodM (SEQ ID NO:12) is specifically a mono-epoxidase, unlike PaxM (SEQ ID NO:47), which is a mono- or di-epoxidase. The inventors further confirmed that NodB (SEQ ID NO:15) from H. pulicicidum acts as the IDT cyclase that cyclizes the monoepoxidized-GGI product 3a to form emindole SB 4a (FIG. 3, trace iii.b, FIG. 6) and that nodB (cDNA (SEQ ID NO:14) and genomic DNA (SEQ ID NO:13)) is a functional ortholog of paxB (cDNA (SEQ ID NO:52) and genomic DNA (SEQ ID NO:51)) from P. paxilli (FIG. 3, trace iv.b, FIG. 7). NodB (SEQ ID NO:15) from H. pulicicidum shares 63% identity with PaxB (SEQ ID NO:53) from P. paxilli (Table 4).

The dedicated NAA 10 core is NAF 5a, generated by oxidation of the terminal methyl carbon, C5″, of emindole SB 4a (FIG. 8B). Accordingly, the inventors confirmed the identity of this oxidase as a P450 oxygenase encoded by H. pulicicidum nodW (cDNA (SEQ ID NO:2) and genomic DNA (SEQ ID NO:1)) by co-expression of nodW (cDNA (SEQ ID NO:2) and genomic DNA (SEQ ID NO:1)) with nodM (cDNA (SEQ ID NO:11) and genomic DNA (SEQ ID NO:10)) in a paxM (cDNA (SEQ ID NO:46) and genomic DNA (SEQ ID NO:45)) deletion background (PN2257), resulting in the production of NAF 5a (FIG. 3, trace v.b, FIG. 9).

To confirm that only five genes are essential for the production of NAF 5a in P. paxilli, and to establish that no other P. paxilli IDT genes from the PAX cluster in the paxM KO strain (PN2257) were contributing to NAF 5a production, the inventors assembled a multigene construct comprising paxG (genomic DNA (SEQ ID NO:40)) from P. paxilli, and nodC (genomic DNA (SEQ ID NO:22)), nodM (genomic DNA (SEQ ID NO:10)), nodB (genomic DNA (SEQ ID NO:13)), and nodW (genomic DNA (SEQ ID NO:1)) from H. pulicicidum. This multigene construct was transformed into a P. paxilli PAX gene cluster knockout strain (PN2250). As expected, expression of the five genes showed that NAF 5a was produced (FIG. 10) and indicated that these five genes are indeed required to biosynthesize NAF 5a.
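The five-gene reconstitution described above can be pictured as an ordered series of steps, each requiring the product of the previous one. The following sketch is a simplified illustration under stated assumptions: the names Step, NAF_PATHWAY and furthest_product are hypothetical; the assignment of paxG to the GGPPS step is inferred from step (I) of the general IDT pathway rather than stated in the paragraph above; and the model ignores any contribution from the host's own primary metabolism.

```python
from typing import Iterable, NamedTuple, Optional


class Step(NamedTuple):
    gene: str
    enzyme: str
    product: str


# Ordered steps leading to NAF 5a, as reconstituted in P. paxilli.
NAF_PATHWAY = [
    Step("paxG", "GGPPS (assignment inferred for this sketch)", "GGPP 1a"),
    Step("nodC", "GGT", "GGI 2"),
    Step("nodM", "FAD-dependent mono-epoxidase", "mono-epoxidized GGI 3a"),
    Step("nodB", "IDT cyclase", "emindole SB 4a"),
    Step("nodW", "cytochrome P450 oxygenase", "NAF 5a"),
]


def furthest_product(expressed_genes: Iterable[str]) -> Optional[str]:
    """Return the last product reachable before the first missing gene.

    Returns None if the first step's gene is absent.
    """
    expressed = set(expressed_genes)
    product = None
    for step in NAF_PATHWAY:
        if step.gene not in expressed:
            break
        product = step.product
    return product


if __name__ == "__main__":
    all_five = [step.gene for step in NAF_PATHWAY]
    print(furthest_product(all_five))
    # Omitting nodW stops the model at the cyclized core.
    print(furthest_product(["paxG", "nodC", "nodM", "nodB"]))
```

The linear model is a bookkeeping aid only; it does not predict yields or account for shunt products.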

Based on the work described herein the inventors disclose the use of heterologous expression to identify the first five steps that deliver the NAA 10 core compound, NAF 5a. In particular, the inventors have confirmed the function of four previously unknown genes from H. pulicicidum: nodC (cDNA (SEQ ID NO:23) and genomic DNA (SEQ ID NO:22)), nodM (cDNA (SEQ ID NO:11) and genomic DNA (SEQ ID NO:10)), nodB (cDNA (SEQ ID NO:14) and genomic DNA (SEQ ID NO:13)), and nodW (cDNA (SEQ ID NO:2) and genomic DNA (SEQ ID NO:1)), and discovered a second filamentous fungal species, H. pulicicidum, that does not appear to have a secondary metabolic GGPPS gene but can still produce IDTs. Without wishing to be bound by theory, the inventors believe that H. pulicicidum relies upon its primary metabolic GGPPS to provide the GGPP for IDT synthesis. The lack of a secondary-metabolic GGPPS may explain why H. pulicicidum produces such low quantities of NAs. The low quantities of NAs produced by H. pulicicidum are a challenge both for resolving the biosynthetic details and for usage of the compounds or their derivatives. Using the efficient gene reassembly of MIDAS and heterologous expression in P. paxilli the inventors have overcome both these issues. Furthermore, the inventors demonstrated that P. paxilli, with its far more favourable growth conditions, is a suitable host for heterologous expression studies, which enabled the inventors to confirm the function of genes more quickly and easily than would have been possible had they relied on the biosynthetic machinery of H. pulicicidum.

Elucidation of the biosynthetic routes for heterologous production of NAF 5a in P. paxilli provides a reasonable expectation of success in being able to fully identify the gene products from H. pulicicidum that are responsible for the ‘decoration’ steps that lead to the production of fully functionalized NAA 10. This reasonable expectation comes from the identification, by the inventors, of the nucleic acid CDSs that are predicted to encode the enzymes required for prenylation to form nodulisporic acid E 6a, and for oxidations, to form nodulisporic acid D 7a. Each of these nucleic acid sequences has been putatively identified from the H. pulicicidum biosynthetic gene cluster described herein. Overall, the inventors' work described herein confirms that heterologous expression of IDT genes in a heterologous host is a viable method that can be employed to produce natural products that are difficult to obtain in other ways, and that could be useful across many industries.

Polypeptides

Accordingly, in one aspect the invention relates to an isolated polypeptide comprising an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

Preferably the functional variant or fragment thereof comprises at least 70%, preferably at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, preferably at least 95%, preferably at least 99% amino acid sequence identity to an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56).

Preferably the isolated polypeptide comprising NodW (SEQ ID NO:3) or a functional variant or fragment thereof has oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polypeptide comprising NodR (SEQ ID NO:6) or a functional variant or fragment thereof has oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polypeptide comprising NodX (SEQ ID NO:9) or a functional variant or fragment thereof has oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polypeptide comprising NodM (SEQ ID NO:12) or a functional variant or fragment thereof has oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polypeptide comprising NodB (SEQ ID NO:15) or a functional variant or fragment thereof has cyclase activity, preferably IDT cyclase activity.

Preferably the isolated polypeptide comprising NodO (SEQ ID NO:18) or a functional variant or fragment thereof has oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polypeptide comprising NodJ (SEQ ID NO:21) or a functional variant or fragment thereof has oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polypeptide comprising NodC (SEQ ID NO:24) or a functional variant or fragment thereof has transferase activity, preferably GGT activity.

Preferably the isolated polypeptide comprising NodY1 (SEQ ID NO:27) or a functional variant or fragment thereof has oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polypeptide comprising NodD2 (SEQ ID NO:30) or a functional variant or fragment thereof has transferase activity, preferably prenyl transferase activity.

Preferably the isolated polypeptide comprising NodD1 (SEQ ID NO:33) or a functional variant or fragment thereof has transferase activity, preferably prenyl transferase activity.

Preferably the isolated polypeptide comprising NodY2 (SEQ ID NO:36) or a functional variant or fragment thereof has oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polypeptide comprising NodZ (SEQ ID NO:39) or a functional variant or fragment thereof has oxygenase activity, preferably cytochrome P450 oxygenase activity.

In one embodiment the isolated polypeptide comprises NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In one embodiment the isolated polypeptide consists essentially of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In one embodiment the isolated polypeptide consists of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

Polynucleotides

In another aspect the invention relates to an isolated polynucleotide encoding a polypeptide comprising an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

Preferably the functional variant or fragment thereof comprises at least 70%, preferably at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, preferably at least 95%, preferably at least 99% amino acid sequence identity to an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56).

Preferably the isolated polynucleotide encodes a polypeptide comprising NodW (SEQ ID NO:3) or a functional variant or fragment thereof having oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodR (SEQ ID NO:6) or a functional variant or fragment thereof having oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodX (SEQ ID NO:9) or a functional variant or fragment thereof having oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodM (SEQ ID NO:12) or a functional variant or fragment thereof having oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodB (SEQ ID NO:15) or a functional variant or fragment thereof having cyclase activity, preferably IDT cyclase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodO (SEQ ID NO:18) or a functional variant or fragment thereof having oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodJ (SEQ ID NO:21) or a functional variant or fragment thereof having oxygenase activity, preferably cytochrome P450 oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodC (SEQ ID NO:24) or a functional variant or fragment thereof having transferase activity, preferably GGT activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodY1 (SEQ ID NO:27) or a functional variant or fragment thereof having oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodD2 (SEQ ID NO:30) or a functional variant or fragment thereof having transferase activity, preferably prenyl transferase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodD1 (SEQ ID NO:33) or a functional variant or fragment thereof having transferase activity, preferably prenyl transferase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodY2 (SEQ ID NO:36) or a functional variant or fragment thereof having oxygenase activity, preferably FAD-dependent oxygenase activity.

Preferably the isolated polynucleotide encodes a polypeptide comprising NodZ (SEQ ID NO:39) or a functional variant or fragment thereof having oxygenase activity, preferably cytochrome P450 oxygenase activity.

In one embodiment the isolated polynucleotide encodes a polypeptide comprising NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In one embodiment the isolated polynucleotide encodes a polypeptide consisting essentially of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In one embodiment the isolated polynucleotide encodes a polypeptide consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof.

In another aspect the invention relates to an isolated polynucleotide comprising at least 70% nucleic acid sequence identity to a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).

Preferably the isolated polynucleotide comprises at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, preferably at least 95%, preferably at least 99% nucleic acid sequence identity to a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).

In one embodiment the isolated polynucleotide comprises a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).

In one embodiment the isolated polynucleotide consists essentially of a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).

In one embodiment the isolated polynucleotide consists of a nucleic acid sequence selected from the group consisting of nodW cDNA (SEQ ID NO:2), nodW genomic DNA (SEQ ID NO:1), nodR cDNA (SEQ ID NO:5), nodR genomic DNA (SEQ ID NO:4), nodX cDNA (SEQ ID NO:8), nodX genomic DNA (SEQ ID NO:7), nodM cDNA (SEQ ID NO:11), nodM genomic DNA (SEQ ID NO:10), nodB cDNA (SEQ ID NO:14), nodB genomic DNA (SEQ ID NO:13), nodO cDNA (SEQ ID NO:17), nodO genomic DNA (SEQ ID NO:16), nodJ cDNA (SEQ ID NO:20), nodJ genomic DNA (SEQ ID NO:19), nodC cDNA (SEQ ID NO:23), nodC genomic DNA (SEQ ID NO:22), nodY1 cDNA (SEQ ID NO:26), nodY1 genomic DNA (SEQ ID NO:25), nodD2 cDNA (SEQ ID NO:29), nodD2 genomic DNA (SEQ ID NO:28), nodD1 cDNA (SEQ ID NO:32), nodD1 genomic DNA (SEQ ID NO:31), nodY2 cDNA (SEQ ID NO:35), nodY2 genomic DNA (SEQ ID NO:34), nodZ cDNA (SEQ ID NO:38), nodZ genomic DNA (SEQ ID NO:37), nodS cDNA (SEQ ID NO:49), nodS genomic DNA (SEQ ID NO:48), nodI cDNA (SEQ ID NO:55), and nodI genomic DNA (SEQ ID NO:54).
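For ease of reference, the sequence identifiers recited in the foregoing lists follow a regular pattern: for each gene, the genomic DNA, the cDNA and the polypeptide occupy three consecutive SEQ ID NOs. The following Python sketch is a lookup built only from the identifiers recited above; the table name PROTEIN_SEQ_IDS and the helper seq_ids are hypothetical names introduced for illustration.

```python
# Polypeptide SEQ ID NOs as recited in the text, keyed by gene suffix.
PROTEIN_SEQ_IDS = {
    "W": 3, "R": 6, "X": 9, "M": 12, "B": 15,
    "O": 18, "J": 21, "C": 24, "Y1": 27, "D2": 30,
    "D1": 33, "Y2": 36, "Z": 39, "S": 50, "I": 56,
}


def seq_ids(suffix: str) -> dict:
    """Return the genomic DNA, cDNA and polypeptide SEQ ID NOs for one gene.

    Pattern read off the recited lists: the cDNA is the polypeptide
    number minus 1, and the genomic DNA is the polypeptide number minus 2.
    """
    protein = PROTEIN_SEQ_IDS[suffix]
    return {
        "gene": f"nod{suffix}",
        "genomic_dna": protein - 2,
        "cdna": protein - 1,
        "polypeptide": protein,
    }


if __name__ == "__main__":
    for suffix in PROTEIN_SEQ_IDS:
        print(seq_ids(suffix))
```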

The nucleic acid molecules of the invention or otherwise described herein are preferably isolated. They can be isolated from a biological sample using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) as known in the art. The nucleic acid molecules of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.

Further methods for isolating polynucleotides include use of all, or portions of, a polynucleotide of the invention as hybridization probes. The technique of hybridizing labeled polynucleotide probes to polynucleotides immobilized on solid supports, such as nitrocellulose filters or nylon membranes, can be used to screen genomic or cDNA libraries. Similarly, probes may be coupled to beads and hybridized to the target sequence. Isolation can be effected using known art protocols such as magnetic separation. The choice of appropriately stringent hybridization and wash conditions is believed to be within the skill of those in the art.

Polynucleotide fragments may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis.

A partial polynucleotide sequence may be used as a probe, in methods well-known in the art, to identify the corresponding full length polynucleotide sequence in a sample. Such methods include PCR-based methods, 5′ RACE, hybridization-based methods, and computer/database-based methods as known in the art. Detectable labels such as radioisotopes, fluorescent, chemiluminescent and bioluminescent labels may be used to facilitate detection. Inverse PCR also permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region as known and used in the art. The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized as known in the art. Primers and primer pairs which allow amplification of polynucleotides of the invention also form a further aspect of this invention.
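The logic of inverse PCR described above can be sketched in a few lines. The following Python example is a deliberately simplified, single-strand model: it treats the restriction fragment as a string, assumes the known region lies at given coordinates within it, and returns the top-strand sequence that divergent primers placed at the two ends of the known region would amplify after the fragment is circularized. The function name inverse_pcr_product, the primer_len parameter and the toy sequences are assumptions made for illustration; no restriction sites, primer design rules or sequences from the sequence listing are modeled.

```python
def inverse_pcr_product(fragment: str, known_start: int, known_end: int,
                        primer_len: int = 4) -> str:
    """Top-strand product of inverse PCR on a circularized fragment.

    fragment    -- linear restriction fragment (5' to 3', top strand)
    known_start -- index of the first base of the known region
    known_end   -- index one past the last base of the known region
    primer_len  -- number of known bases covered by each divergent primer
    """
    if not (0 <= known_start < known_end <= len(fragment)):
        raise ValueError("known region must lie inside the fragment")
    if known_end - known_start < 2 * primer_len:
        raise ValueError("known region too short for two primers")
    # Forward primer sits on the last primer_len bases of the known region
    # and extends outward into the downstream flank.
    forward_start = known_end - primer_len
    # Reverse primer anneals over the first primer_len bases of the known
    # region and extends outward into the upstream flank.
    reverse_end = known_start + primer_len
    # On the circle, the downstream flank is joined to the upstream flank
    # at the ligation junction, so the product reads across that junction.
    return fragment[forward_start:] + fragment[:reverse_end]


if __name__ == "__main__":
    upstream, known, downstream = "GATTACA", "AAAACCCC", "TGCATGC"
    fragment = upstream + known + downstream
    start = len(upstream)
    end = start + len(known)
    print(inverse_pcr_product(fragment, start, end))
```

The product contains both previously unknown flanks, bracketed by the primer-binding portions of the known region, which is why sequencing it reveals the flanking sequence.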

Variants (including orthologues) may be identified by the methods described. Variant polynucleotides may be identified using PCR-based methods as known in the art. Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.

Further methods for identifying variant polynucleotides include use of all, or portions of, the specified polynucleotides as hybridization probes to screen genomic or cDNA libraries as described above. Typically probes based on a sequence encoding a conserved region of the corresponding amino acid sequence may be used. Hybridization conditions may also be less stringent than those used when screening for sequences identical to the probe.

In another aspect the invention relates to a TU comprising at least one isolated polynucleotide as described herein. In one embodiment the TU is comprised in a vector, preferably an expression vector. In one embodiment the vector is selected from the group consisting of plasmids, BACs, PACs, YACs, bacteriophage, phagemids, and cosmids. Preferably the vector is a plasmid.

In another aspect the invention relates to a vector that encodes an isolated polypeptide or functional variant or fragment thereof according to the invention.

In another aspect the invention relates to a vector comprising an isolated nucleic acid sequence according to the invention.

In one embodiment the isolated nucleic acid sequence is comprised in a TU.

In one embodiment the vector is selected from the group consisting of plasmids, BACs, PACs, YACs, bacteriophage, phagemids, and cosmids. Preferably the vector is a plasmid. In one embodiment the vector is an expression vector.

A TU comprising a polynucleotide of the invention can be incorporated into any suitable vector capable of expressing that polynucleotide or, where applicable, an encoded polypeptide of the invention in vitro or in a host cell. Preferably the vector is an expression vector. Examples of suitable expression vectors include, but are not limited to, plasmid DNA vectors, viral DNA vectors (such as adenovirus and adeno-associated virus), or viral RNA vectors (such as retroviral vectors). In some embodiments the plasmid and/or phage vectors may be selected from the following vectors or variants thereof including pUC18, pU19, Mp18, Mp19, ColE1, PCR1 and pKRC; lambda gt10 and M13 plasmids such as pBR322, pACYC184, pT127, RP4, p1J101, SV40 and BPV. Also included are vectors such as, but not limited to, cosmids, YACs, BACs, shuttle vectors such as pSA3, PAT28, transposons (such as described in U.S. Pat. No. 5,792,294) and the like.

Suitable viral vectors include, but are not limited to, vectors derived from adenovirus (AV); adeno-associated virus (AAV); retroviruses (e.g., lentiviruses (LV), Rhabdoviruses, murine leukemia virus); herpes virus, and the like. Viral vectors employed herein can be appropriately modified by pseudotyping with envelope proteins or other surface antigens from other viruses, or by substituting different viral capsid proteins, as known and used in the art.

In one embodiment the expression vector comprises at least one, preferably at least two, preferably at least three, preferably at least four, preferably at least five, preferably at least six, preferably at least seven, preferably at least eight, preferably at least nine, preferably at least 10 isolated polynucleotides as described herein.

In one embodiment the expression vector comprises at least one, preferably at least two, preferably at least three, preferably at least four, preferably at least five, preferably at least six, preferably at least seven, preferably at least eight, preferably at least nine, preferably at least 10 TUs as described herein.

In one embodiment the vector is a component in a cloning system. In one embodiment the cloning system is useful for making a gene construct comprising at least one TU.

In one embodiment the vector is comprised in a vector set, the vector set being part of a cloning system. In one embodiment the cloning system is useful for making a gene construct comprising at least one TU.

In one embodiment the cloning system is useful for making a gene construct comprising at least one TU.

In one embodiment the gene construct is a multigene construct comprising at least two TUs. In one embodiment the multigene construct comprises at least three, preferably at least four, preferably at least five, preferably at least six, preferably at least seven, preferably at least eight, preferably at least nine, preferably at least ten TUs.

The TUs described herein may comprise one or more of the disclosed polynucleotide sequences and/or polynucleotides encoding the disclosed polypeptides of the invention. The TU can be constructed to drive expression of at least one polypeptide involved in the biosynthesis of NAA 10, either in vitro or in vivo. In one embodiment, the TU comprises a polynucleotide of the invention operatively linked to 5′ or 3′ untranslated regulatory sequences. The design of a particular TU will depend on various factors including the host cells in which the operatively linked polynucleotide is to be expressed and the desired level of polynucleotide expression.

Likewise, the selection of various promoters, enhancers and/or other genetic elements for a TU will depend on various factors including the host cells and expression levels discussed above. In one embodiment, the TU comprises a homologous promoter operatively linked to a polynucleotide of the invention. In another embodiment, the expression cassette comprises a heterologous promoter operatively linked to a polynucleotide of the invention. In one embodiment, the homologous or heterologous promoter is an inducible, repressible or regulatable promoter. A suitable promoter may be chosen and used under the appropriate conditions to direct high-level expression of a polynucleotide of the invention. Many such elements are described in the literature and are available through commercial suppliers.

By way of example only, promoters useful in the expression cassettes can be any suitable eukaryotic or prokaryotic promoter. In one embodiment, the eukaryotic promoter can be a eukaryotic RNA polymerase I (pol I), RNA polymerase II (pol II), or RNA polymerase III (pol III) promoter. Expression levels of an operably linked polynucleotide in a particular cell type will be determined by the nearby presence (or absence) of specific gene regulatory sequences (e.g., enhancers, silencers and the like). Any suitable promoter/enhancer combination (see: Eukaryotic Promoter Data Base EPDB) can be used to drive expression of a polynucleotide of the invention.

Additional promoters useful in expression cassettes include β-lactamase, alkaline phosphatase, tryptophan, and tac promoter systems which are all well known in the art. Yeast promoters include 3-phosphoglycerate kinase, enolase, hexokinase, pyruvate decarboxylase, glucokinase, and glyceraldehyde-3-phosphate dehydrogenase but are not limited thereto.

Prokaryotic promoters useful in expression cassettes include constitutive promoters as known in the art (such as the int promoter of bacteriophage lambda and the bla promoter of the beta-lactamase gene sequence of pBR322) and regulatable promoters (such as lacZ, recA and gal). A ribosome binding site upstream of the CDS may also be required for expression.

Enhancers useful in a TU include SV40 enhancer, cytomegalovirus early promoter enhancer, globin, albumin, insulin and the like.

In one embodiment, a TU may be driven by a T3, T7 or SP6 cytoplasmic expression system.

The choice of a particular promoter/enhancer/cell type combination for protein expression is within the ordinary skill of those in the art of molecular biology (see, for example, Sambrook et al. (1989) which is incorporated herein by reference).

In another aspect the invention relates to an isolated host cell comprising an isolated polypeptide, isolated polynucleotide, TU and/or isolated vector according to the invention.

In one embodiment the isolated host cell is a prokaryotic or eukaryotic cell. Prokaryotes most commonly employed as host cells are strains of Escherichia coli (E. coli). Other prokaryotic hosts include Pseudomonas, Bacillus, Serratia, Klebsiella, Streptomyces, Listeria, Salmonella and Mycobacteria but are not limited thereto.

In one embodiment the eukaryotic cell is an animal cell, a plant cell, a fungal cell or a protist cell. In one embodiment the animal cell is an insect cell or a mammalian cell. In one embodiment the fungal cell is a single cell of a unicellular fungal host strain. In one embodiment the fungal cell comprises fungal hyphae or the mycelia of a fungal host strain.

In one embodiment the fungal cell, hyphae or mycelia of the fungal host strain are from the genus Aspergillus, Trichoderma, Neurospora, Fusarium, Mortierella, Chrysosporium, Candida, Geotrichum, Yarrowia, Eremothecium, Trichoplusia, Ashbya, Hansenula, Pichia, Kluyveromyces, Schizosaccharomyces, Monascus, Talaromyces, Cryphonectria, Endothia, Tolypocladium, Hypocrea, Gibberella, Acremonium, Agaricus, Pleurotus, Penicillium, Volvariella, Flammulina, Lentinula, Auricularia, Ganoderma, (Rhizo)mucor, Rhizopus, or Saccharomyces, preferably Penicillium, Aspergillus, Saccharomyces, Pichia, Trichoplusia, and Spodoptera. Preferably the fungal cell is from Saccharomyces. Preferably the fungal hyphae or mycelia is from Penicillium, preferably P. paxilli.

In another aspect the invention relates to a method of making at least one NA comprising heterologously expressing at least one polypeptide, isolated nucleic acid sequence, TU or vector according to the invention in an isolated host cell.

In one embodiment the NA is selected from the group of NAs depicted in FIG. 1. Preferably the NA is NAF 5a or NAA 10, preferably NAA 10.

In one embodiment the polypeptide is a polypeptide or functional variant or fragment according to the invention.

Specifically contemplated as embodiments within this aspect of the invention are various embodiments set out herein with regards to any other aspect of the invention that relate to heterologous expression (including choice of appropriate regulatory sequences), expression cassettes, genetic elements, TUs, multigene constructs, host cells, and vectors.

In a particular embodiment, heterologous expression of the polypeptide comprises expression of at least one polynucleotide according to the invention or at least one TU encoding at least one polypeptide of the invention, from at least one vector as described herein in an isolated fungal host cell or in the mycelia of an isolated fungal strain as described herein. In one embodiment the polypeptide is NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), or NodZ (SEQ ID NO:39), preferably the polypeptide is an enzyme that catalyzes a biological transformation from NAB 9 to NAA 10. In one embodiment the fungal cell or strain is a cell or strain of Penicillium, preferably P. paxilli.

In one embodiment the TU is comprised in a multigene construct comprising at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine and/or at least 10 polynucleotides encoding polypeptides according to the invention.

In another aspect the invention relates to at least one NA made by a method of the invention. In one embodiment the NA is selected from the group of NAs depicted in FIG. 1. Preferably the NA is NAF 5a or NAA 10, preferably NAA 10.

In another aspect the present invention relates to an isolated polypeptide or functional variant or fragment thereof from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.

In another aspect the present invention relates to an isolated polynucleotide encoding at least one polypeptide from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.

In one embodiment the isolated polypeptide is an oxygenase, preferably a cytochrome P450 oxygenase or a FAD-dependent oxygenase. Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodJ (SEQ ID NO:21), or NodZ (SEQ ID NO:39). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12), NodO (SEQ ID NO:18), NodY1 (SEQ ID NO:27), or NodY2 (SEQ ID NO:36). In one embodiment the isolated polypeptide is a transferase, preferably a GGT or a prenyl transferase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the prenyl transferase is NodD1 (SEQ ID NO:33) or NodD2 (SEQ ID NO:30). In one embodiment the isolated polypeptide is an IDT cyclase. Preferably the IDT cyclase is NodB (SEQ ID NO:15). In one embodiment the isolated polypeptide is NodS (SEQ ID NO:50). In one embodiment the isolated polypeptide is NodI (SEQ ID NO:56).

In one embodiment the isolated polypeptide catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAF 5a. Preferably the isolated polypeptide is a GGT, a FAD-dependent oxygenase, an IDT cyclase, or a cytochrome P450 oxygenase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12). Preferably the IDT cyclase is NodB (SEQ ID NO:15). Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3).

In one embodiment the isolated polypeptide or functional variant or fragment thereof is encoded by a nucleic acid according to the invention.

In another aspect the invention relates to a method of making at least one Hypoxylon spp. polypeptide or functional variant or fragment thereof comprising heterologously expressing an isolated nucleic acid sequence, TU or vector according to the invention in an isolated host cell.

In one embodiment the at least one Hypoxylon spp. polypeptide is a polypeptide according to the invention as contemplated herein for any other aspect of the invention.

In one embodiment the at least one Hypoxylon spp. polypeptide is a polypeptide comprising an amino acid sequence of NodW (SEQ ID NO:3) or a functional variant or fragment thereof. Preferably the polypeptide consists essentially of or consists of NodW (SEQ ID NO:3). In one embodiment the isolated host cell comprises fungal mycelia of the genus Penicillium, preferably P. paxilli.

Specifically contemplated for this aspect of the invention are various embodiments set out for any other aspect of the invention that relate to the heterologous expression (including choice of appropriate regulatory sequences), genetic elements, TUs, multigene constructs, host cells, and vectors.

In another aspect the invention relates to a method of making at least one NA comprising heterologously expressing in an isolated host cell at least one polypeptide that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.

In one embodiment the at least one polypeptide is an oxygenase, preferably a cytochrome P450 oxygenase or a FAD-dependent oxygenase. Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodJ (SEQ ID NO:21), or NodZ (SEQ ID NO:39). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12), NodO (SEQ ID NO:18), NodY1 (SEQ ID NO:27), or NodY2 (SEQ ID NO:36). In one embodiment the isolated polypeptide is a transferase, preferably a GGT or a prenyl transferase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the prenyl transferase is NodD1 (SEQ ID NO:33) or NodD2 (SEQ ID NO:30). In one embodiment the isolated polypeptide is an IDT cyclase. Preferably the IDT cyclase is NodB (SEQ ID NO:15). In one embodiment the isolated polypeptide is NodS (SEQ ID NO:50). In one embodiment the isolated polypeptide is NodI (SEQ ID NO:56).

In one embodiment the at least one polypeptide catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAF 5a. Preferably the at least one polypeptide is a GGT, a FAD-dependent oxygenase, an IDT cyclase, or a cytochrome P450 oxygenase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12). Preferably the IDT cyclase is NodB (SEQ ID NO:15). Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3).

In one embodiment the at least one polypeptide comprises the amino acid sequence of NodW (SEQ ID NO:3) or a functional variant or fragment thereof. Preferably the polypeptide consists essentially of or consists of NodW (SEQ ID NO:3). In one embodiment the isolated host cell comprises fungal mycelia of the genus Penicillium, preferably P. paxilli.

Specifically contemplated for this aspect of the invention are various embodiments set out for any other aspect of the invention that relate to the heterologous expression (including choice of appropriate regulatory sequences), genetic elements, TUs, multigene constructs, host cells, and vectors.

In another aspect the invention relates to an isolated host cell that expresses at least one heterologous polypeptide that catalyzes the transformation of a substrate in the biosynthetic pathway leading to the formation of NAA 10.

In one embodiment at least one heterologous polypeptide catalyzes the transformation of a substrate in the biosynthetic pathway leading to the formation of NAF 5a.

In one embodiment the substrate is selected from the group consisting of GGPP 1a, indole-3-glycerol phosphate 1b, GGI 2, mono-epoxidized GGI 3a, emindole SB 4a, NAF 5a, NAE 6a, NAD 7a, NAC 8, and NAB 9.

In one embodiment the transformation is selected from the group consisting of a condensation, an oxidation, and a cyclization.

In one embodiment the substrates that are transformed are GGPP 1a and indole-3-glycerol phosphate 1b, and the transformation is a condensation.

In one embodiment the substrate that is transformed is GGI 2 and the transformation is an oxidation.

In one embodiment the substrate that is transformed is mono-epoxidized GGI 3a and the transformation is a cyclization.

In one embodiment the substrate that is transformed is emindole SB 4a and the transformation is an oxidation.

In one embodiment the substrate that is transformed is NAF 5a and the transformation is a condensation.

In one embodiment the substrate that is transformed is NAE 6a and the transformation is an oxidation.

In one embodiment the substrate that is transformed is NAD 7a and the transformations are an oxidation and a condensation.

In one embodiment the substrate that is transformed is NAC 8 and the transformation is an oxidation.

In one embodiment the substrate that is transformed is NAB 9 and the transformation is an oxidation.
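The substrate and transformation pairings recited in the preceding embodiments can be collected into a single table. The following Python sketch records only the pairings stated above; it does not assign an enzyme or a product to any step, and the names PATHWAY_STEPS, transformations_for and substrates_undergoing are hypothetical names introduced for illustration.

```python
# Each entry: (substrates transformed together, transformation types recited).
PATHWAY_STEPS = [
    (("GGPP 1a", "indole-3-glycerol phosphate 1b"), ("condensation",)),
    (("GGI 2",), ("oxidation",)),
    (("mono-epoxidized GGI 3a",), ("cyclization",)),
    (("emindole SB 4a",), ("oxidation",)),
    (("NAF 5a",), ("condensation",)),
    (("NAE 6a",), ("oxidation",)),
    (("NAD 7a",), ("oxidation", "condensation")),
    (("NAC 8",), ("oxidation",)),
    (("NAB 9",), ("oxidation",)),
]


def transformations_for(substrate: str) -> tuple:
    """Return the transformation types recited for a given substrate."""
    for substrates, kinds in PATHWAY_STEPS:
        if substrate in substrates:
            return kinds
    raise KeyError(f"substrate not recited: {substrate}")


def substrates_undergoing(kind: str) -> list:
    """List every substrate for which the given transformation is recited."""
    return [name
            for substrates, kinds in PATHWAY_STEPS if kind in kinds
            for name in substrates]


if __name__ == "__main__":
    print(transformations_for("NAD 7a"))
    print(substrates_undergoing("condensation"))
```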

In another aspect the invention relates to an isolated host cell that produces, by heterologous expression, at least one polypeptide involved in the biosynthetic pathway leading to NAA 10.

In one embodiment the at least one polypeptide catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAF 5a. Preferably the at least one polypeptide is a GGT, a FAD-dependent oxygenase, an IDT cyclase, or a cytochrome P450 oxygenase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12). Preferably the IDT cyclase is NodB (SEQ ID NO:15). Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3).

In some embodiments specifically contemplated for this aspect of the invention, the at least one polypeptide is a polypeptide involved in the biosynthetic pathway leading to NAA 10 as defined herein for any other aspect of the invention.

In one embodiment at least one polypeptide is a polypeptide or functional variant or fragment thereof of the invention. In one embodiment the polypeptide or functional variant or fragment thereof is encoded by a nucleic acid sequence of the invention.

In one embodiment the at least one polypeptide is involved in the biosynthetic pathway leading to NAF 5a. In one embodiment the at least one polypeptide comprises the amino acid sequence of NodW (SEQ ID NO:3) or a functional variant or fragment thereof. Preferably the polypeptide consists essentially of or consists of NodW (SEQ ID NO:3). In one embodiment the isolated host cell comprises fungal mycelia of the genus Penicillium, preferably P. paxilli.

Specifically contemplated for this aspect of the invention are various embodiments set out for any other aspect of the invention that relate to the heterologous expression (including choice of appropriate regulatory sequences), genetic elements, TUs, multigene constructs, host cells, and vectors.

In another aspect the invention relates to a method of producing at least one NA comprising contacting a carbohydrate-comprising substrate with a recombinant cell transformed with a nucleic acid that results in an increased level or activity of a polypeptide selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof compared to the cell prior to transformation, such that the substrate is metabolized to at least one NA.

In one embodiment the nucleic acid encodes at least one polypeptide that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAF 5a, preferably that catalyzes the biochemical reaction that leads from emindole SB 4a to NAF 5a.

In one embodiment the recombinant host cell is an isolated host cell of the invention as described herein.

In one embodiment the carbohydrate is comprised in a culture medium. In one embodiment the culture medium is CDYE or a variation thereof that supports the growth of the recombinant cell.

In one embodiment the nucleic acid encodes at least one polypeptide that is an oxygenase, preferably a cytochrome P450 oxygenase or a FAD-dependent oxygenase. Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodJ (SEQ ID NO:21), or NodZ (SEQ ID NO:39). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12), NodO (SEQ ID NO:18), NodY1 (SEQ ID NO:27), or NodY2 (SEQ ID NO:36). In one embodiment the isolated polypeptide is a transferase, preferably a GGT, or a prenyl transferase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the prenyl transferases are NodD1 (SEQ ID NO:33), or NodD2 (SEQ ID NO:30). In one embodiment the isolated polypeptide is an IDT cyclase. Preferably the IDT cyclase is NodB (SEQ ID NO:15). In one embodiment the isolated polypeptide is NodS (SEQ ID NO:50). In one embodiment the isolated polypeptide is NodI (SEQ ID NO:56).

In one embodiment the nucleic acid encodes at least one GGT, FAD-dependent oxygenase, IDT cyclase, or cytochrome P450 oxygenase. In one embodiment the nucleic acid encodes at least two, preferably at least three, preferably all four of the GGT, FAD-dependent oxygenase, IDT cyclase, and cytochrome P450 oxygenase. Preferably the GGT is NodC (SEQ ID NO:24). Preferably the FAD-dependent oxygenase is NodM (SEQ ID NO:12). Preferably the IDT cyclase is NodB (SEQ ID NO:15). Preferably the cytochrome P450 oxygenase is NodW (SEQ ID NO:3).

In one embodiment a polypeptide selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), and NodI (SEQ ID NO:56) or a functional variant or fragment thereof comprises the amino acid sequence of a NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodJ (SEQ ID NO:21), NodC (SEQ ID NO:24), NodY1 (SEQ ID NO:27), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodY2 (SEQ ID NO:36), NodZ (SEQ ID NO:39), NodS (SEQ ID NO:50), or NodI (SEQ ID NO:56) or functional variant or fragment thereof of the invention.

In one embodiment the polypeptide comprises the amino acid sequence of NodW (SEQ ID NO:3) or a functional variant or fragment thereof. Preferably the polypeptide consists essentially of or consists of NodW (SEQ ID NO:3). In one embodiment the isolated host cell comprises fungal mycelia of the genus Penicillium, preferably P. paxilli.

Specifically contemplated for this aspect of the invention are various embodiments set out for any other aspect of the invention that relate to the heterologous expression (including choice of appropriate regulatory sequences), genetic elements, TUs, multigene constructs, host cells, and vectors.

In one embodiment the at least one heterologous or introduced homologous nucleic acid sequence is at least one NAA 10 biosynthetic gene selected from the group consisting of nodW, nodR, nodX, nodM, nodB, nodO, nodJ, nodC, nodY1, nodD2, nodD1, nodY2, nodZ, nodS, and nodI as described herein.

In one embodiment one of the two different GGPPS enzymes is produced in H. pulicicidum by heterologous expression.

In one embodiment one of the two different GGPPS enzymes is encoded by a second copy of a native H. pulicicidum gene that encodes a GGPPS enzyme.

In another aspect the invention relates to an isolated strain of Hypoxylon pulicicidum that comprises a genetic modification that leads to an increased biosynthesis of NAA 10.

In one embodiment the isolated strain comprises increased expression of at least one GGPPS enzyme as compared to a control strain of H. pulicicidum, preferably H. pulicicidum ATCC 74245.

In one embodiment the increased expression is increased expression of the native primary GGPPS gene of H. pulicicidum via modification of genetic regulatory elements.

In one embodiment modification of genetic regulatory elements comprises operatively linking the native primary GGPPS gene to an alternative or modified promoter.

In one embodiment modification of genetic regulatory elements comprises operatively linking a native primary GGPPS gene to a more robust native promoter. In one embodiment the native primary GGPPS gene is an introduced homologous gene.

In one embodiment the increased expression is the result of heterologous expression of biosynthetic genes that contribute to NAA 10 biosynthesis.

In one embodiment the increased expression is due to expression of heterologous genes in H. pulicicidum that have equivalent biochemical function to genes identified in the Nod cluster, wherein the Nod cluster is as described herein.

In one embodiment the increased expression is due to expression of heterologous genes in H. pulicicidum to remediate limitations in the supply of substrate compounds or biosynthetic intermediates that are necessary for NAA 10 biosynthesis.

In one embodiment the increased expression is due to heterologous expression in H. pulicicidum of any gene that encodes a GGT that catalyzes the condensation of GGPP 1a and indole-3-glycerol phosphate 1b to produce 3-geranylgeranyl indole 2.

In one embodiment the increased expression is due to heterologous expression in H. pulicicidum of any gene that encodes a GGPPS.

In one embodiment the increased expression is due to heterologous expression in H. pulicicidum of any gene that encodes a FAD-dependent oxidase that creates the single epoxidized-GGI product 3a.

In one embodiment the increased expression is due to heterologous expression in H. pulicicidum of any gene that encodes an enzyme that cyclises the single epoxidized-GGI 3a to produce emindole SB 4a.

In one embodiment the increased expression is due to heterologous expression in H. pulicicidum of any gene that encodes an oxidase that oxidises emindole SB 4a to produce a nodulisporic acid.

In one embodiment the increased expression is due to at least one genetic modification that leads to the increased expression of a NA biosynthetic gene selected from the group consisting of nodW, nodR, nodX, nodM, nodB, nodO, nodJ, nodC, nodY1, nodD2, nodD1, nodY2, nodZ, nodS and nodI as described herein.

In another aspect the invention relates to a method of making NAA 10 comprising expressing at least one heterologous nucleic acid sequence in Hypoxylon pulicicidum, wherein the at least one heterologous nucleic acid sequence encodes an enzyme in a biosynthetic pathway leading to NAA 10.

Specifically contemplated for this aspect of the invention are various embodiments set out for any other aspect of the invention that relate to the isolated strains of Hypoxylon pulicicidum as described herein, including as relates to increased expression of NAA 10, and also including all embodiments set out regarding heterologous expression (including choice of appropriate regulatory sequences), genetic elements, TUs, multigene constructs, host cells, and vectors as described herein.

In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.

The invention will now be illustrated in a non-limiting way by reference to the following examples.

EXAMPLES

Materials and Methods

gDNA Isolation for Genome Sequencing and TUM Amplification

Genomic DNA for genome sequencing and TUM amplification by PCR was isolated from Penicillium paxilli strain ATCC® 26601™ (PN2013)²⁴ and Hypoxylon pulicicidum strain ATCC® 74245™,²⁵ according to Byrd et al.²⁶ with modifications. Sterile 2.4% (w/v) Difco™ potato dextrose broth (Becton, Dickinson and Company, Maryland, U.S.A.) in Milli-Q® water was prepared in 25 mL aliquots in 125 mL Erlenmeyer flasks and inoculated with 5×10⁶ spores or ~1 cm² freshly ground mycelia (for non-sporulating strains). Cultures were incubated for 2-4 days at 22° C. with shaking (200 rpm). The fermentation broth was filtered through a sterile nappy liner and the mycelia were rinsed three times with sterile water. Mycelia were transferred to a sterile 15 mL centrifuge tube and flash frozen in liquid nitrogen for lyophilization for 24-48 hours. 15-20 mg of freeze-dried mycelia was placed in a mortar with liquid nitrogen and ground into a powder. The ground mycelia were transferred into a 2 mL tube and resuspended in 1 mL extraction buffer (150 mM EDTA, 50 mM Tris-HCl, and 1% (w/v) sodium lauroyl sarcosine). 1.6 mg proteinase K was added to the tube and the contents were incubated at 37° C. for 30 min. The tube was centrifuged at 13,000 rpm for 10 min and the supernatant was transferred to a fresh 2 mL tube. 500 μL phenol and 500 μL chloroform were added to the tube and the contents were mixed by vortex before centrifugation for 10 min at 13,000 rpm. The aqueous phase was transferred to a fresh 2 mL tube and washed two more times with 500 μL phenol and 500 μL chloroform as previously described. The aqueous phase was then transferred to a fresh 2 mL tube and washed (vortex and centrifuge at 13,000 rpm for 10 min) with 1 mL chloroform. The aqueous phase was transferred to a fresh 2 mL tube and mixed with 1 mL chilled isopropanol. The DNA was precipitated overnight at −20° C. and pelleted at 13,000 rpm for 10 min. The supernatant was discarded and the DNA was resuspended in 1 mL 1 M NaCl. The tube was incubated for 10 min at room temperature and then centrifuged at 13,000 rpm for 10 min to pellet polysaccharides. The supernatant was transferred to a fresh tube and mixed with 1 mL isopropanol. The tube was incubated at room temperature for 10 min and DNA was pelleted by centrifugation at 13,000 rpm for 10 min. The supernatant was discarded and 1 mL chilled 70% ethanol was added to the pellet without resuspension. The tube was centrifuged for 2 min at 13,000 rpm and the supernatant was discarded. The tube was centrifuged for 1 min at 13,000 rpm and residual 70% ethanol was pipetted off. The pellet was air dried at room temperature, resuspended in 50 μL Milli-Q® water and stored at −20° C.

MIDAS Design Overview

The MIDAS toolkit is based on the Golden Gate assembly technique,²⁷ which utilises the ability of Type IIS restriction enzymes to seamlessly join multiple DNA fragments together in a single reaction. MIDAS makes use of three Type IIS restriction enzymes, AarI, BsaI and BsmBI, which generate user-defined 4 bp overhangs upon cleavage. Through the appropriate choice of these user-defined overhangs, and the appropriate orientation of the Type IIS sites flanking each of the DNA fragments, multiple fragments can be assembled into a recipient plasmid (also called a destination vector) in an ordered (directional) fashion using a one-pot restriction-ligation reaction. The recipient plasmid contains a marker gene (typically the lacZα gene for blue/white screening) flanked by two divergently oriented recognition sites for a Type IIS enzyme; these elements, collectively called the ‘Golden Gate cloning cassette’, are replaced by the insert during the assembly reaction.

As with other recently described Golden Gate-based modular assembly techniques,²⁸⁻³² assembly of genes and multigene constructs using MIDAS is a hierarchical process. At the first level (MIDAS Level-1), functional modules (promoters, CDS, terminators, tags, etc.) are cloned into the Level-1 destination vector (pML1), where they form libraries of reusable, sequence-verified parts. The complementary design of the modules and destination vector ensures that, once cloned into pML1, these modules can be released from the vector by digestion with BsaI.

At the second level (Level-2), compatible sets of the sequence-verified Level-1 modules are released from pML1 and assembled into a Level-2 destination vector (pML2) using a BsaI-mediated Golden Gate reaction, leading to creation of a Level-2 plasmid containing a eukaryotic TU. Once again, the design rules ensure that each assembled TU can be released from the pML2 vector, this time by digestion with either AarI or BsmBI (depending on the pML2 vector in which the TU was assembled).

At Level-3, the TUs that were assembled at Level-2 are released from the pML2 plasmids and are sequentially assembled together in a Level-3 destination vector (pML3), using either AarI- or BsmBI-mediated Golden Gate reactions, to form functional multigene constructs, which can then be transformed into the desired expression host.

Level-1: Module Cloning

At Level-1, functional TUMs are generated, either as a PCR product or as a synthetic polynucleotide sequence (from a gene synthesis company), and are cloned into the Level-1 destination vector (pML1) by BsmBI-mediated Golden Gate cloning. In order to be cloned into the pML1 vector, the PCR primers are designed so that each amplified TUM is flanked by two convergent BsmBI sites, BsmBI[CTCG] and [AGAC]BsmBI, which upon restriction enzyme cleavage generate sticky ends that are compatible with those of the BsmBI sites present in the pML1 destination vector. Thus, the Golden Gate cloning cassette present in pML1 consists of two divergent BsmBI sites flanking a lacZα scoreable marker: 5′-[CTCG]BsmBI-lacZα-BsmBI[AGAC]-3′ (FIG. 11A).

To enable subsequent (i.e., Level-2) assembly of full-length TUs, each TUM is designed to be flanked by four module-specific nucleotides (NNNN) at the 5′ end, and four module-specific nucleotides (NNNN) at the 3′ end, which are included as part of the PCR primer sequences. The complementary design of the amplified modules and the pML1 vector ensures that, when amplified TUMs are cloned into pML1 using the BsmBI-mediated Golden Gate reaction, each TUM becomes flanked by convergent BsaI recognition sites, and the module-specific nucleotides (NNNN and NNNN) become the BsaI-specific 4 bp overhangs when the module is released from pML1 during the subsequent (i.e., Level-2) BsaI-mediated Golden Gate assembly of the full-length TU. Thus, the overall structure of each module in the PCR product (or synthetic polynucleotide) takes the form: 5′-BsmBI[CTCG]NNNN-TUM-NNNNtg[AGAC]BsmBI-3′ (FIG. 11B), which becomes 5′-BsaI[NNNN]-TUM-[NNNN]BsaI-3′ in pML1, following BsmBI-mediated cloning (FIG. 11C).

As each TUM is defined by its flanking four nucleotides, these module-specific bases effectively form an address system for each TUM and they determine its position and orientation within the assembled TU. The developers of MoClo and GoldenBraid2.0 have already worked in concert to develop a common syntax or set of standard addresses for plant expression (referred to as ‘fusion sites’ in the MoClo system and ‘barcodes’ in GoldenBraid2.0) for a wide variety of TUMs to facilitate part exchangeability,³³ and this standard is also adopted here for MIDAS-based assembly of TUs for expression in filamentous fungi (FIG. 12).

Thus, for filamentous fungal expression, a ProUTR module (comprising a promoter, 5′ untranslated region (UTR) and ATG initiation codon) would have GGAG as the module-specific 5′ nucleotides, and AATG as the module-specific 3′ nucleotides (i.e., 5′-GGAG-ProUTR-AATG-3′, in which the ATG of AATG is the translation initiation codon). Similarly, a CDS module would be flanked by AATG and GCTT (i.e., 5′-AATG-CDS-GCTT-3′), while a UTRterm module (consisting of a 3′UTR and a 3′ non-transcribed region, including the polyadenylation signal) would have the form 5′-GCTT-UTRterm-CGCT-3′. Considerations for the design of PCR primers for amplifying these three types of TUM are shown in Table 12.
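The address system described above can be illustrated with a short Python sketch. It is only an illustration, not part of the MIDAS toolkit: the function name order_modules and the dictionary layout are assumptions made here, while the four-nucleotide addresses are those given in the text. The sketch chains modules by matching the 3′ address of one module to the 5′ address of the next, starting from GGAG and ending at CGCT.

```python
# Four-nucleotide module addresses for fungal TUs, as given in the text.
MODULE_ADDRESSES = {
    "ProUTR": ("GGAG", "AATG"),
    "CDS": ("AATG", "GCTT"),
    "UTRterm": ("GCTT", "CGCT"),
}


def order_modules(modules, start="GGAG", end="CGCT"):
    """Return module names in assembly order (hypothetical helper).

    Each module's 3' address must equal the 5' address of the next module,
    and the chain must run from `start` to `end` using every module once.
    """
    by_five_prime = {five: name for name, (five, _three) in modules.items()}
    if len(by_five_prime) != len(modules):
        raise ValueError("two modules share the same 5' address")
    order, current = [], start
    while current != end:
        name = by_five_prime.get(current)
        if name is None or name in order:
            raise ValueError(f"no unused module begins with {current}")
        order.append(name)
        current = modules[name][1]
    if len(order) != len(modules):
        raise ValueError("some modules are not part of the chain")
    return order


tu_order = order_modules(MODULE_ADDRESSES)
print(tu_order)
```

A set of modules that lacks, for example, the CDS module raises a ValueError, because no remaining module begins with AATG.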

Following the BsmBI-mediated assembly of TUMs in pML1, reactions are transformed into an E. coli strain such as DH5α (or equivalent) and spread onto LB plates supplemented with spectinomycin, IPTG and X-Gal. Plasmids harbouring a cloned TUM are identified by screening white colonies and confirmed by sequencing.

At MIDAS Level-1, it is important that all internal recognition sites for AarI, BsaI and BsmBI are masked or eliminated from the TUMs. The process of masking or removal of such forbidden sites, referred to as “domestication”, can be achieved by: (i) excluding these sites when ordering the sequences from a gene synthesis company, (ii) directed mutagenesis, or (iii) using masking oligonucleotides that form triplexes with the target DNA, thereby preventing restriction enzyme cleavage.³⁰ In the same way that Type IIS enzymes have previously been utilised for mutagenesis³⁴ and for Golden Gate domestication purposes,²⁷,³⁵ we domesticated MIDAS modules by designing PCR primers (referred to as domestication primers) that overlap the internal Type IIS restriction site and which contain a single nucleotide mismatch that destroys the site. Because the PCR products are designed to be assembled together in MIDAS using a BsmBI-mediated Golden Gate reaction to form the full-length domesticated TUM in pML1, it is important that the MIDAS domestication primers be designed with BsmBI restriction sites that generate compatible overhangs at their 5′ ends.
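As a minimal sketch of how candidate modules might be screened before domestication, the following Python snippet searches both strands of a sequence for the recognition sequences of the three enzymes. The recognition sequences used (AarI CACCTGC, BsaI GGTCTC, BsmBI CGTCTC) are the standard ones for these enzymes; the function names and the example sequence are hypothetical and are not taken from the specification.

```python
# Standard recognition sequences of the three Type IIS enzymes used by MIDAS.
SITES = {"AarI": "CACCTGC", "BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_forbidden_sites(seq):
    """List (enzyme, strand, 0-based start) for every internal site.

    '+' means the site reads 5'->3' on the given strand; '-' means it
    lies on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        for strand, pattern in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((enzyme, strand, start))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical test sequence: a BsaI site on the top strand, followed by
# a BsmBI site and an AarI site on the complementary strand.
example = "ATG" + "GGTCTC" + "AAA" + "GAGACG" + "TTT" + "GCAGGTG" + "CC"
hits = find_forbidden_sites(example)
print(hits)
```

Each reported position marks a site that a domestication primer carrying a single-nucleotide mismatch would need to overlap.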

Level-2: TU Assembly

At Level-2, compatible sets of cloned and sequence-verified Level-1 TUMs (for example ProUTR, CDS and UTRterm modules) are assembled into a pML2 destination vector using a BsaI-mediated Golden Gate reaction, leading to creation of a Level-2 plasmid (pML2 entry clone) containing a complete (i.e., full-length) eukaryotic TU. The module address standard described earlier ensures that the assembly of a TU proceeds in an ordered, directional fashion, with the 3′ end of one module being compatible with the 5′ end of the next module.

The module-specific bases GGAG, located at the 5′ end of ProUTR modules, and CGCT, at the 3′ end of UTRterm modules, are compatible with the overhangs generated by BsaI digestion of the pML2 destination vectors, and these bases therefore define the outermost cloning boundaries of a Level-2 assembly.

In MIDAS, there are eight Level-2 (pML2) destination vectors into which a TU can be assembled, the choice of which depends on the desired configuration of TUs in the multigene plasmid produced at Level-3, namely: (i) the desired order in which TUs are added to the multigene assembly, (ii) the desired direction in which the multigene plasmid is assembled and (iii) the desired orientation of each TU in the multigene plasmid. These features are discussed further below.

The pML2 vectors are distinguished from one another by the arrangement of specific sequence features that are central to the operation of MIDAS. These sequence features, collectively called the MIDAS cassette (FIG. 13), define the Level-2 assembly of TUs and govern the assembly of multigene constructs produced at Level-3.

Each MIDAS cassette is defined by (i) having a Golden Gate cloning cassette with flanking, divergent BsaI recognition sites, (ii) differing arrangements of recognition sites for AarI and BsmBI and (iii) the presence or absence of a lacZα scoreable marker. These features are described in greater detail below.

In contrast to the usual Golden Gate cloning cassette (which typically contains a lacZα gene for blue/white screening), the Golden Gate cloning cassettes in all eight pML2 vectors contain a mutant E. coli pheS gene (driven by the promoter of the E. coli gene for chloramphenicol acetyltransferase) flanked by divergent BsaI recognition sites. The Thr²⁵¹Ala/Ala²⁹⁴Gly double mutant of the E. coli pheS gene used here confers high lethality to cells grown on LB media supplemented with the phenylalanine analogue 4-chloro-phenylalanine, 4CP.³⁶ During BsaI-mediated Level-2 assembly of TUs, the mutant pheS gene is eliminated from the pML2 vectors and can therefore be used as a negative selection marker.

The eight pML2 vectors can be divided into two classes, “Blue” and “White”, depending on the presence or absence, respectively, of a lacZα gene in the MIDAS cassette (see FIG. 13). There are four “Blue” pML2 vectors (indicated by the “B” in the plasmid name) and four “White” pML2 vectors (indicated by the “W” in the plasmid name). The “Blue” and “White” vectors also differ in the relative configuration of the AarI and BsmBI restriction sites in their MIDAS cassettes. Thus, in the “Blue” vectors, the entire MIDAS cassette is flanked by convergent BsmBI sites and nested within is the lacZα gene flanked by divergent AarI sites. In the “White” vectors, the enzyme configuration is switched (the entire MIDAS cassette is flanked by convergent AarI sites and nested within are two divergent BsmBI sites) and there is no lacZα gene. It is important to note that the lacZα chromogenic marker in the pML2 vectors is not used for blue/white screening during the Level-2 Golden Gate assembly of TUs (it is reserved for the Level-3 cloning), but the choice of “Blue” or “White” vector into which a TU should be assembled must be made during Level-2 assembly of TUs, as this will determine the order in which that TU is added to the multigene construct at Level-3. Likewise, the AarI and BsmBI sites are also not used for Level-2 assembly of TUs; instead they are integral to the Level-3 assembly of multigene constructs. These considerations, including the differences between the (+) and (−) vectors, are discussed further below, under the Level-3 description.

The orientation (direction of transcription) of each TU can be freely defined by assembling each TU in either a pML2 “Forward” vector (indicated by “F” in the plasmid name) or a pML2 “Reverse” vector (indicated by “R” in the plasmid name). The pML2 “Reverse” vectors have their BsaI recognition sites (for Golden Gate assembly of TUs) switched relative to the BsaI fusion sites in the pML2 “Forward” vectors. Thus, pML2 “Forward” vectors have their pheS-based Golden Gate cassette oriented 5′[GGAG]BsaI-pheS-BsaI[CGCT]-3′, while the pML2 “Reverse” vectors have their BsaI recognition sites switched: 5′[AGCG]BsaI-pheS-BsaI[CTCC]-3′, where the arrowhead indicates the direction of transcription of the mutant pheS gene.
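The relationship between the “Forward” and “Reverse” fusion sites can be checked with a few lines of Python. This is an illustrative sketch only (the variable and function names are assumptions): it shows that the “Reverse” pair quoted above is the reverse complement of the “Forward” pair taken in the opposite order, which is why a TU assembled in a “Reverse” vector is inserted in the opposite orientation.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Fusion sites quoted in the text for the two vector orientations.
FORWARD_SITES = ("GGAG", "CGCT")
REVERSE_SITES = ("AGCG", "CTCC")

# Flipping the cassette swaps the two sites and reverse-complements each.
flipped = tuple(reverse_complement(site) for site in reversed(FORWARD_SITES))
print(flipped, flipped == REVERSE_SITES)
```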

In contrast to the cloned Level-1 modules, the pML2 destination vectors confer kanamycin resistance, allowing efficient counter selection against Level-1 module backbones, while the mutant pheS gene provides powerful negative selection against any parental pML2 destination plasmids when E. coli DH5α cells (or equivalent) transformed with the assembly reactions are spread onto LB plates supplemented with kanamycin and 4CP.

Level-3: Assembly of Multigene Constructs

At MIDAS Level-3, TUs that were assembled in the pML2 plasmids are sequentially loaded (by binary assembly) into the Level-3 destination vector (pML3) to form the multigene construct.

Assembly of multigene constructs at Level-3 is crucially dependent on the relative configuration of the AarI and BsmBI restriction sites in the MIDAS cassettes located in the “Blue” and “White” pML2 vectors; the nested and inverted configuration of these restriction sites in the White vectors compared to the Blue vectors is a defining feature of the MIDAS multigene assembly process. In the “Blue” vectors, the entire MIDAS cassette has flanking convergent BsmBI sites and nested within is a lacZα gene flanked by divergent AarI sites. In the “White” vectors, the enzyme configuration is inverted (the entire MIDAS cassette has flanking convergent AarI sites and nested within are two divergent BsmBI sites) and there is no lacZα gene. As illustrated in FIG. 14, the nesting and inversion of the restriction sites in the “Blue” and “White” vectors mean that TUs assembled into “White” MIDAS cassettes can be inserted into “Blue” MIDAS cassettes using AarI-mediated Golden Gate reactions and, conversely, TUs assembled into “Blue” MIDAS cassettes can be cloned into “White” MIDAS cassettes using BsmBI-mediated Golden Gate reactions. This cycle of cloning (i.e., alternating between “White” and “Blue” pML2 entry clones) can be repeated indefinitely.

The Golden Gate cloning cassette found in the Level-3 destination vector, pML3, consists of a lacZα gene flanked by divergent AarI sites: [CATT]AarI-lacZα-AarI[CGTA], so the MIDAS Level-3 assembly is always initiated (i.e., the first TU is always added) using an AarI-mediated Golden Gate reaction between pML3 and a TU that has been assembled into a pML2 “White” destination vector (FIG. 14). The plasmid generated is then used in a BsmBI-mediated Golden Gate reaction with a TU cloned into a pML2 “Blue” destination vector. Further TUs are added by following this approach of alternating between AarI- and BsmBI-mediated Golden Gate reactions using pML2 “White” and pML2 “Blue” entry clones, respectively. Thus, each plasmid generated by cloning a TU into the multigene construct becomes the destination vector for the next cycle of TU addition (FIG. 14).

Following each cloning cycle, E. coli DH5α cells (or equivalent) are transformed with the Golden Gate reactions, spread onto LB plates supplemented with spectinomycin, IPTG and X-Gal, and positive clones are identified by blue/white screening. Spectinomycin selects for cells that have taken up the Level-3 plasmid and counter selects against any pML2 plasmid backbones. Note that, whereas the lacZα chromogenic marker present in the pML2 “Blue” vectors was not utilised during Level-2 assembly of TUs, it is at the level of multigene assembly (Level-3) that it is used for blue/white screening. Thus, for TUs assembled into the multigene construct using AarI-mediated Golden Gate reactions, white colonies are picked for analysis, while TUs assembled into the multigene construct using BsmBI-mediated Golden Gate reactions are analysed by picking blue colonies (see Table 13).
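The alternation described above can be summarised as a simple planning routine. The following Python sketch is an illustration under the rules stated in the text (first TU from a “White” entry clone with AarI and white colonies picked, then a “Blue” entry clone with BsmBI and blue colonies picked, and so on); the function name level3_plan and the TU names are hypothetical.

```python
def level3_plan(tu_names):
    """Return one planning record per TU, in order of addition.

    Odd-numbered additions (1st, 3rd, ...) use a pML2 "White" entry clone
    with AarI and white colonies are picked; even-numbered additions use
    a pML2 "Blue" entry clone with BsmBI and blue colonies are picked.
    """
    plan = []
    for index, tu in enumerate(tu_names):
        if index % 2 == 0:
            step = {"tu": tu, "pml2_class": "White", "enzyme": "AarI", "pick": "white"}
        else:
            step = {"tu": tu, "pml2_class": "Blue", "enzyme": "BsmBI", "pick": "blue"}
        plan.append(step)
    return plan


# Hypothetical four-TU construct.
plan = level3_plan(["TU1", "TU2", "TU3", "TU4"])
for step in plan:
    print(step)
```

Because the class of pML2 vector fixes the cycle in which a TU can be added, such a plan has to be decided before the Level-2 assemblies are carried out.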

In its simplest configuration, MIDAS can achieve multigene assembly using only two pML2 destination vectors: one “White” vector and one “Blue” vector (FIG. 15A). The full set of eight pML2 vectors is provided to enable maximum user control over: (i) the order in which each TU is added to the growing multigene construct, (ii) the desired orientation (that is, the direction of transcription) of each TU and (iii) the polarity of assembly, i.e., the direction in which incoming TUs are loaded into the multigene construct.

Firstly, and as described earlier, the order of addition of each TU to the growing multigene construct is governed by the choice of “White” or “Blue” pML2 destination vector into which the TUs are assembled.

Secondly, and as described previously when discussing the Level-2 features, the orientation (direction of transcription) of a TU can be freely defined by the choice of “Forward” or “Reverse” pML2 vector into which the TU is assembled. Extending MIDAS to include the option of assembling TUs in either orientation expands the vector suite to four pML2 plasmids (see FIG. 13 and FIG. 15B).

Thirdly, the polarity of multigene assembly (i.e., the direction in which new TUs are added to the growing multigene assembly) can also be freely defined, in this case by assembling TUs in pML2 destination vectors of either “plus” (+) or “minus” (−) polarity (FIG. 13). The use of a pML2(+) entry clone for Level-3 assembly ensures that the TU added next will be added in the same direction as the direction of transcription of the Spec^(R) gene found in pML3, i.e., the TU assembled next in the multigene construct will be added to the right of the TU that was added using the pML2(+) entry clone (as illustrated in FIG. 15A and FIG. 15B). In contrast, use of a pML2(−) entry clone for Level-3 assembly forces the next TU to be added in the direction opposite to that of the direction of transcription of the Spec^(R) gene found in pML3, so that the next TU loaded into the multigene construct will be added to the left of the TU that was added using the pML2(−) entry clone. If, however, entry clones of both polarities (i.e., both pML2(+) and pML2(−) entry clones) are used to build the multigene construct, then this gives MIDAS the ability to switch the direction in which new TUs are added to the Level-3 assembly, and for the hypothetical assembly shown in FIG. 15C all subsequently added TUs will be nested between TU3 and TU2.
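The effect of polarity on the position of the next TU can be modelled with a list and an insertion cursor. The Python sketch below is a simplified model written for illustration, not a description of FIG. 15C itself: the function name and the example order of polarities (TU1 plus, TU2 minus, TU3 plus) are assumptions chosen so that later TUs end up nested between TU3 and TU2.

```python
def assemble_level3(steps):
    """Simulate the left-to-right order of TUs in a multigene construct.

    `steps` is a sequence of (tu_name, polarity) pairs in order of addition.
    After a "+" entry clone the next TU goes to the right of the TU just
    added; after a "-" entry clone it goes to the left of it.
    Returns (construct, cursor), where cursor is the next insertion index.
    """
    construct, cursor = [], 0
    for name, polarity in steps:
        if polarity not in ("+", "-"):
            raise ValueError(f"polarity must be '+' or '-', got {polarity!r}")
        construct.insert(cursor, name)
        if polarity == "+":
            cursor += 1  # next insertion point lies to the right of this TU
        # for "-", the cursor stays put: next insertion is to the left
    return construct, cursor


# Assumed example with mixed polarities.
mixed, _ = assemble_level3([("TU1", "+"), ("TU2", "-"), ("TU3", "+"), ("TU4", "+")])
print(mixed)
```

With only (+) entry clones the construct grows to the right, and with only (−) entry clones it grows to the left; mixing the two moves the insertion point into the interior of the construct.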

Bacterial and Fungal Strains

Routine growth of Escherichia coli was performed at 37° C. in LB broth. Chemically competent E. coli HST08 Stellar cells (Clontech Laboratories, Inc.) were used for routine transformations and maintenance of plasmids. Penicillium paxilli strains used in this study are shown in Table 7.

Protocols for MIDAS Level-1 Module Cloning

PCR-amplified modules were purified using spin-column protocols and cloned into the MIDAS Level-1 plasmid, pML1, by BsmBI-mediated Golden Gate assembly. Typically, 1-2 μL (approximately 50-200 ng) of pML1 plasmid DNA from a miniprep was mixed with 1-2 μL of each purified PCR fragment, 1 μL of BsmBI (20 U/μL), 1 μL of T4 DNA Ligase (20 U/μL) and 2 μL of 10× T4 DNA Ligase buffer in a total reaction volume of 20 μL. Reactions were incubated at 37° C. for 1 to 3 hours and an aliquot (typically 2-3 μL) was transformed into 30 μL of E. coli HST08 Stellar competent cells by heat shock. Following the recovery period (i.e., addition of 250 μL SOC medium and incubation at 37° C. for 1 hour), aliquots of the transformation mix were spread onto LB agar plates supplemented with 50 μg/mL spectinomycin, 1 mM IPTG and 50 μg/mL X-Gal. Plates were incubated overnight at 37° C., and white colonies were chosen for analysis.

Protocols for MIDAS Level-2 TU Assembly

Using the modules cloned at Level-1, full-length TUs were assembled into MIDAS Level-2 plasmids by BsaI-mediated Golden Gate assembly. Typically, 40 fmol of pML2 plasmid DNA was mixed with 40 fmol of plasmid DNA of each Level-1 entry clone, 1 μL of BsaI-HF (20 U/μL), 1 μL of T4 DNA Ligase (20 U/μL) and 2 μL of 10× T4 DNA Ligase buffer in a total reaction volume of 20 μL. Reactions were incubated in a DNA Engine PTC-200 Peltier Thermal Cycler (Bio-Rad) using the following parameters: 45 cycles of (2 minutes at 37° C. and 5 minutes at 16° C.), followed by 5 minutes at 50° C. and 10 minutes at 80° C. Reactions were transformed as described for the Level-1 assembly and spread onto LB agar plates containing 75 μg/mL kanamycin and 1.25 mM 4CP. Following overnight incubation at 37° C., colonies were picked for analysis.
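Because the Level-2 reactions are set up in molar amounts, the mass of each plasmid to pipette depends on its length. The following Python sketch shows one way to make the conversion; it assumes an average mass of about 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value from this specification), and the plasmid length and stock concentration in the example are hypothetical.

```python
# Approximate average molar mass of one base pair of dsDNA (assumption).
AVG_BP_MASS_G_PER_MOL = 650.0


def fmol_to_ng(amount_fmol, length_bp):
    """Mass in ng of `amount_fmol` of a double-stranded plasmid.

    fmol x bp x (g/mol per bp) gives femtograms; dividing by 1e6 gives ng.
    """
    return amount_fmol * length_bp * AVG_BP_MASS_G_PER_MOL / 1e6


def volume_ul(amount_fmol, length_bp, stock_ng_per_ul):
    """Volume of stock (in microlitres) that contains `amount_fmol`."""
    return fmol_to_ng(amount_fmol, length_bp) / stock_ng_per_ul


# Hypothetical example: 40 fmol of a 3,000 bp plasmid from a 50 ng/uL miniprep.
mass_ng = fmol_to_ng(40, 3000)
vol_ul = volume_ul(40, 3000, 50.0)
print(f"{mass_ng:.1f} ng, {vol_ul:.2f} uL")
```

The same conversion applies to the 40 fmol amounts used for the Level-3 reactions.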

Protocols for MIDAS Level-3 Multigene Assembly

Full-length TUs assembled at Level-2 were used to create multigene assemblies in the Level-3 destination vector by alternating Golden Gate assembly using either AarI (for TUs cloned into pML2 “White” vectors) or BsmBI (for TUs cloned into pML2 “Blue” vectors). Typically, 40 fmol of Level-3 destination vector plasmid DNA was mixed with 40 fmol of Level-2 entry clone plasmid DNA, 1 μL of BsaI-HF (20 U/μL), 1 μL of T4 DNA Ligase (20 U/μL) and 2 μL of 10× T4 DNA Ligase buffer in a total reaction volume of 20 μL. Reactions were incubated in a DNA Engine PTC-200 Peltier Thermal Cycler (Bio-Rad) using the following parameters: 45 cycles of (2 minutes at 37° C. and 5 minutes at 16° C.), followed by 5 minutes at 37° C. and 10 minutes at 80° C. Reactions were transformed as described for the Level-1 assembly and spread onto LB agar plates supplemented with 50 μg/mL spectinomycin, 1 mM IPTG and 50 μg/mL X-Gal. Plates were incubated overnight at 37° C. For AarI-mediated assembly reactions, white colonies were chosen for analysis while, for BsmBI-mediated assembly reactions, blue colonies were selected.

Media and Reagents Used for Fungal Work.

CDYE (Czapex-Dox/Yeast extract) medium with trace elements was made withdeionized water and contained 3.34% (w/v) Czapex-Dox (Oxoid Ltd.,Hampshire, England), 0.5% (w/v) yeast extract (Oxoid Ltd., Hampshire,England), and 0.5% (v/v) trace element solution. For agar plates, Selectagar (Invitrogen, California, U.S.A.) was added to 1.5% (w/v).

Trace element solution was made in deionized water and contained 0.004% (w/v) cobalt(II) chloride hexahydrate (Ajax Finechem, Auckland, New Zealand), 0.005% (w/v) copper(II) sulfate pentahydrate (Scharlau, Barcelona, Spain), 0.05% (w/v) iron(II) sulfate heptahydrate (Merck, Darmstadt, Germany), 0.014% (w/v) manganese(II) sulfate tetrahydrate, and 0.05% (w/v) zinc sulfate heptahydrate (Merck, Darmstadt, Germany). The solution was preserved with 1 drop of 12 M hydrochloric acid.

Regeneration (RG) medium was made with deionized water and contained 2% (w/v) malt extract (Oxoid Ltd., Hampshire, England), 2% (w/v) D(+)-glucose anhydrous (VWR International BVBA, Leuven, Belgium), 1% (w/v) mycological peptone (Oxoid Ltd., Hampshire, England), and 27.6% sucrose (ECP Ltd. Birkenhead, Auckland, New Zealand). Depending on whether the medium was to be used for plates (1.5% RGA) or overlays (0.8% RGA), Select agar (Invitrogen, California, U.S.A.) was added to 1.5% or 0.8% (w/v), respectively.
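The media compositions above are given as percentages, so preparing a batch means converting each percentage to grams (or millilitres) for the chosen volume. The sketch below shows one way to do this. The dictionary names and the function are illustrative, and the 27.6% sucrose figure is treated as w/v, which the text does not state explicitly.

```python
# Percent compositions taken from the media descriptions above.
# % (w/v) means grams per 100 mL; % (v/v) means mL per 100 mL.
CDYE = {
    "Czapek-Dox (g)": 3.34,
    "yeast extract (g)": 0.5,
    "trace element solution (mL)": 0.5,
}

RG = {
    "malt extract (g)": 2.0,
    "D(+)-glucose (g)": 2.0,
    "mycological peptone (g)": 1.0,
    "sucrose (g)": 27.6,  # assumption: interpreted as % (w/v)
}


def scale_recipe(percent_recipe: dict, volume_ml: float, agar_percent: float = 0.0) -> dict:
    """Return the amount of each component needed for `volume_ml` of medium."""
    amounts = {name: pct / 100.0 * volume_ml for name, pct in percent_recipe.items()}
    if agar_percent:
        # Select agar is added at 1.5% (plates) or 0.8% (overlays), both w/v.
        amounts["Select agar (g)"] = agar_percent / 100.0 * volume_ml
    return amounts


if __name__ == "__main__":
    for name, amount in scale_recipe(RG, 500, agar_percent=0.8).items():
        print(f"{name}: {amount:.2f}")
```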

Fungal Protocols—Protoplast Preparation

The preparation of fungal protoplasts for transformation followed the method of Yelton et al.³⁷ with modifications. Five 25 mL aliquots of CDYE medium with trace elements, in 100 mL Erlenmeyer flasks, were inoculated with 5×10⁶ spores and incubated for 28 hours at 28° C. with shaking (200 rpm). The fermentation broth from all five flasks was filtered through a sterile nappy liner and the combined mycelia were rinsed three times with sterile water and once with OM buffer (10 mM Na₂HPO₄ and 1.2 M MgSO₄·7H₂O, brought to pH 5.8 with 100 mM NaH₂PO₄·2H₂O). Mycelia were weighed, resuspended in 10 mL of filter-sterilized Lysing Enzymes solution (prepared by resuspending Lysing Enzymes from Trichoderma harzianum (Sigma) at 10 mg/mL in OM buffer) per gram of mycelia, and incubated for 16 hours at 30° C. with shaking at 80 rpm. Protoplasts were filtered through a sterile nappy liner into a 250 mL Erlenmeyer flask. Aliquots (5 mL) of filtered protoplasts were transferred into sterile 15 mL centrifuge tubes and overlaid with 2 mL of ST buffer (0.6 M sorbitol and 0.1 M Tris-HCl at pH 8.0). Tubes were centrifuged at 2600×g for 15 minutes at 4° C. The white layer of protoplasts that formed between the OM and ST buffers in each tube was transferred (in 2 mL aliquots) into sterile 15 mL centrifuge tubes, gently washed by pipette resuspension in 5 mL of STC buffer (1 M sorbitol, 50 mM Tris-HCl at pH 8.0, and 50 mM CaCl₂) and centrifuged at 2600×g for 5 minutes at 4° C. The supernatant was decanted off and pelleted protoplasts from multiple tubes were combined by resuspension in 5 mL aliquots of STC buffer. The STC buffer wash was repeated three times until protoplasts were pooled into a single 15 mL centrifuge tube. The final protoplast pellet was resuspended in 500 μL of STC buffer and protoplast concentration was estimated with a hemocytometer. The protoplast stock was diluted to give a final concentration of 1.25×10⁸ protoplasts per mL of STC buffer. Aliquots of protoplasts (100 μL) were used immediately for fungal transformations and excess protoplasts were preserved in 8% PEG solution (80 μL of protoplasts were added to 20 μL of 40% (w/v) PEG 4000 in STC buffer) in 1.7 mL micro-centrifuge tubes and stored at −80° C.
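The dilution and storage steps above involve three small calculations: the volume of STC buffer needed to reach 1.25×10⁸ protoplasts per mL, the number of protoplasts in a 100 μL aliquot, and the final PEG concentration of the frozen stock. A minimal sketch follows; the function names and the example hemocytometer estimate are hypothetical.

```python
TARGET_PER_ML = 1.25e8  # final protoplast concentration stated in the protocol


def stc_to_add_ml(measured_per_ml: float, current_volume_ml: float) -> float:
    """Volume of STC buffer (mL) to add so the stock reaches TARGET_PER_ML."""
    if measured_per_ml < TARGET_PER_ML:
        raise ValueError("stock is already more dilute than the target")
    total_protoplasts = measured_per_ml * current_volume_ml
    final_volume_ml = total_protoplasts / TARGET_PER_ML
    return final_volume_ml - current_volume_ml


def protoplasts_per_aliquot(aliquot_ul: float) -> float:
    """Number of protoplasts in an aliquot of the diluted stock."""
    return TARGET_PER_ML * aliquot_ul / 1000.0


def final_peg_percent(protoplast_ul: float, peg_ul: float, peg_stock_percent: float) -> float:
    """Final PEG concentration (% w/v) after mixing protoplasts with PEG stock."""
    return peg_stock_percent * peg_ul / (protoplast_ul + peg_ul)


if __name__ == "__main__":
    # Hypothetical count: 5e8 protoplasts/mL in the 500 uL resuspension.
    print(stc_to_add_ml(5e8, 0.5))
    print(protoplasts_per_aliquot(100))
    print(final_peg_percent(80, 20, 40))
```

The last call reproduces the 8% PEG figure quoted for the storage solution (20 μL of 40% PEG in a 100 μL total).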

Fungal Protocols—Transformation of P. paxilli

Fungal transformations, modified from Vollmer and Yanofsky³⁸ and Oliver et al.³⁹, were carried out in 1.7 mL micro-centrifuge tubes containing 100 μL (1.25×10⁷) protoplasts, either freshly prepared in STC buffer, or stored in 8% PEG solution (as described above). A solution containing 2 μL of spermidine (50 mM in H₂O), 5 μL heparin (5 mg/mL in STC buffer), and 5 μg of plasmid DNA (250 μg/mL) was added to the protoplasts and, following incubation on ice for 30 minutes, 900 μL of 40% PEG solution (40% (w/v) PEG 4000 in STC buffer) was added. The transformation mixture was incubated on ice for a further 15-20 minutes, transferred to 17.5 mL of 0.8% RGA medium (prewarmed to 50° C.) in sterile 50 mL tubes, mixed by inversion, and 3.5 mL aliquots were dispensed onto 1.5% RGA plates. Following overnight incubation at 25° C., 5 mL of 0.8% RGA (containing sufficient geneticin to achieve a final concentration of 150 μg per mL of solid media) was overlaid onto each plate. Plates were incubated for a further 4 days at 25° C. and spores were picked from individual colonies and streaked onto CDYE agar plates supplemented with 150 μg/mL geneticin. Streaked plates were incubated at 25° C. for a further 4 days. Spores from individual colonies were suspended in 50 μL of 0.01% (v/v) Triton X-100 and 5×5 μL aliquots of the spore suspension were transferred onto new CDYE agar plates supplemented with 150 μg/mL geneticin. Sporulation plates were incubated at 25° C. for 4 days and spore stocks were prepared as follows. Colony plugs from the sporulation plates were suspended in 2 mL of 0.01% (v/v) Triton X-100, and 800 μL of suspended spores were mixed with 200 μL of 50% (w/v) glycerol in a 1.7 mL micro-centrifuge tube. Spore stocks were used to inoculate 50 mL of CDYE media, flash frozen in liquid nitrogen and stored at −80° C.

Indole Diterpene Production and Extraction

Fungal transformants were grown in 50 mL of CDYE medium with trace elements for 7 days at 28° C. in shaker cultures (≥200 rpm), in 250 mL Erlenmeyer flasks capped with cotton wool. Mycelia were isolated from fermentation broths by filtration through nappy liners, transferred to 50 mL centrifuge tubes (Lab Serv®, Thermo Fisher Scientific) and indole diterpenes were extracted by vigorously shaking the mycelia (≥200 rpm) in 2-butanone for ≥45 minutes.

Thin-Layer Chromatography

The 2-butanone supernatant (containing extracted indole diterpenes) was used for thin-layer chromatography (TLC) analysis on solid phase silica gel 60 aluminium plates (Merck). Indole diterpenes were chromatographed with 9:1 chloroform:acetonitrile or 8:2 dichloromethane:acetonitrile and visualized with Ehrlich's reagent (1% (w/v) p-dimethylaminobenzaldehyde in 24% (v/v) HCl and 50% ethanol).

Liquid Chromatography-Mass Spectrometry

Samples were prepared for liquid chromatography-mass spectrometry (LC-MS) from those transformants that tested positive by TLC. Accordingly, a 1 mL sample of the 2-butanone supernatant (containing extracted indole diterpenes) was transferred to a 1.7 mL micro-centrifuge tube and the 2-butanone was evaporated overnight. Contents were resuspended in 100% acetonitrile and filtered through a 0.2 μm membrane into an LC-MS vial. LC-MS samples were chromatographed on a reverse phase Thermo Scientific Accucore 2.6 μm C18 (50×2.1 mm) column attached to an UltiMate® 3000 Standard LC system (Dionex, Thermo Fisher Scientific) run at a flow rate of 0.200 mL/minute and eluted with aqueous solutions of acetonitrile containing 0.01% formic acid using a multistep gradient method (Table 14). Mass spectra were captured through in-line analysis on a maXis™ II quadrupole-time-of-flight mass spectrometer (Bruker).

Large Scale Indole Diterpene Purification for NMR Analysis

Fungal transformants that produced high levels of novel indole diterpenes were grown in ≥1 litre of CDYE medium with trace elements, as described under “Indole diterpene production and extraction”. Mycelia were pooled into 1 litre Schott bottles containing stir bars. 2-butanone was added and indole diterpenes were extracted overnight with stirring (≥700 rpm). Extracts were filtered through Celite® 545 (J. T. Baker®) and dry loaded onto silica with rotary evaporation for crude purification by silica column prior to a final purification by semi-preparative HPLC. A 1 mL aliquot of crude extract was injected onto a semi-preparative reversed phase Phenomenex 5 μm C18(2) 100 Å (250×15 mm) column attached to an UltiMate® 3000 Standard LC system (Dionex, Thermo Fisher Scientific) run at a flow rate of 8.00 mL/minute. Multistep gradient methods were optimized for the purification of different sets of indole diterpenes. The purity of each indole diterpene was assessed by LC-MS and the structure was identified by NMR.

NMR

NMR samples were prepared in deuterated chloroform. Compounds were analysed by standard one-dimensional proton and carbon-13 NMR, two-dimensional correlation spectroscopy (COSY), heteronuclear single quantum correlation spectroscopy (HSQC), and heteronuclear multiple bond correlation spectroscopy (HMBC).

Tables 1-14 referenced in this specification are set out below:

TABLE 1 Functional assignment of predicted genes in the putative nodulisporic acid gene cluster.
Gene | Size of encoded protein (aa) | Predicted function [Specific Function] | Most notable BLASTp match: Organism | % identity/% coverage | E-value | Protein name and accession number
nodI | 1664 | WD40 domain protein | Hypoxylon sp. CO27-5 | 36% ID/80% | 0 | OTA80149
nodW | 608 | Cytochrome P450 oxygenase [terminal-C dioxygenase] | Aspergillus aculeatus | 44% ID/97% | 9.00E−153 | XP_020058732
nodR* | 511 | Cytochrome P450 oxygenase | Penicillium simplicissimum | 36% ID/97% | 3.00E−108 | PtmU/BAU61563
nodX | 593 | Cytochrome P450 oxygenase | Hypoxylon sp. | 62% ID/70% | 0 | OTA78491
nodM* | 463 | FAD-dependent oxygenase [IDT mono-epoxidase] | Aspergillus flavus | 55% ID/93% | 5.00E−173 | AtmD/Q672V4
nodB* | 243 | IDT cyclase [IDT cyclase] | Penicillium crustosum | 68% ID/99% | 9.00E−119 | PenB/AGZ20190
nodO* | 448 | FAD-dependent oxygenase | Penicillium janthinellum | 60% ID/97% | 2.00E−160 | JanO/AGZ20488
nodJ | 514 | Cytochrome P450 oxygenase | Aspergillus clavatus | 42% ID/99% | 3.00E−148 | XP_001270361
nodC* | 326 | Geranylgeranyl transferase [Geranylgeranyl transferase] | Penicillium crustosum | 66% ID/83% | 2.00E−136 | PenC/AGZ20189
nodY1 | 431 | FAD-dependent oxygenase | Penicillium oxalicum | 34% ID/99% | 2.00E−71 | OxaD/AOC80388
nodD2* | 434 | prenyl transferase | Penicillium janthinellum | 48% ID/96% | 1.00E−144 | JanD/AGZ20478
nodD1* | 431 | prenyl transferase | Penicillium janthinellum | 53% ID/94% | 1.00E−155 | JanD/AGZ20478
nodY2 | 461 | FAD-dependent oxygenase | Aspergillus alliaceus | 42% ID/98% | 3.00E−105 | AspB/P0DOW1
nodZ | 477 | Cytochrome P450 oxygenase | Penicillium flavigenum | 48% ID/96% | 7.00E−166 | OQE14847
nodS | 535 | Not stated | Hypoxylon sp. CO27-5 | 46% ID/94% | 9.00E−139 | OTA93952

Naming of genes in IDT clusters has followed the A. nidulans naming convention where genes are given a name with a three letter prefix in lower case that designates species, followed by a single letter suffix in upper case that designates gene function written in italic font (e.g. paxC). Naming of the corresponding protein product follows the same rules except that the initial letter of the prefix is upper case and the entire name is written in normal (non-italic) font (e.g. PaxC is the protein product of paxC). Thus, a nod name was assigned to each H. pulicicidum gene in the NAA 10 gene cluster. H. pulicicidum genes that share homology (>35% amino acid identity of predicted translational products) with genes found in known IDT pathways are followed by an asterisk (*) and, with the exception of nodR, were given letters corresponding to known confirmed genes (e.g. the protein encoded by nodC shares 52.8% amino acid identity with the protein product of paxC). The genes that do not share homology with known IDT genes were assigned letters that are not shared with any of the confirmed IDT genes. Notably, the cluster contains two sets of paralogous genes (share >40% amino acid identity with each other), which we have distinguished using numbers (i.e. nodD1|nodD2 and nodY1|nodY2). Closest matches were identified using BLASTp (protein-protein BLAST) against the non-redundant protein sequence database with ‘expect threshold’ set at 10 and ‘word size’ set at 6. The BLOSUM62 scoring matrix was applied with a gap opening penalty of 11 and a gap extension penalty of 1 with conditional compositional score matrix adjustment.

TABLE 2 Similarity matrix of geranylgeranyl transferases (‘C’ enzymes).
Enzyme | PaxC | NodC | LtmC | AtmC | JanC | PenC
PaxC | 100 | 52.3 | 35.3 | 54.3 | 70.9 | 60.5
NodC | 66.7 | 100 | 38.8 | 55.1 | 55.4 | 54.4
LtmC | 54 | 61.7 | 100 | 40.2 | 36.4 | 39.5
AtmC | 69.9 | 70.5 | 61.5 | 100 | 58.3 | 63.6
JanC | 79.5 | 70.3 | 55.5 | 74.1 | 100 | 66.7
PenC | 73.4 | 70.5 | 59.1 | 78 | 79.2 | 100

Numbers in italics represent % identity scores. Numbers in bold represent % similarity scores for amino acid residues.
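Tables 2 to 6 each pack two measures into one square matrix. The sketch below shows how such a matrix can be read programmatically, using the Table 2 values. It assumes that the values above the diagonal are the % identity scores and those below the diagonal are the % similarity scores; this is inferred from the fact that identity cannot exceed similarity, since the italic and bold typography is not visible in plain text. The function names are illustrative.

```python
ENZYMES = ["PaxC", "NodC", "LtmC", "AtmC", "JanC", "PenC"]

# Values copied row by row from Table 2.
MATRIX = [
    [100, 52.3, 35.3, 54.3, 70.9, 60.5],
    [66.7, 100, 38.8, 55.1, 55.4, 54.4],
    [54, 61.7, 100, 40.2, 36.4, 39.5],
    [69.9, 70.5, 61.5, 100, 58.3, 63.6],
    [79.5, 70.3, 55.5, 74.1, 100, 66.7],
    [73.4, 70.5, 59.1, 78, 79.2, 100],
]


def _indices(a: str, b: str) -> tuple:
    i, j = ENZYMES.index(a), ENZYMES.index(b)
    return min(i, j), max(i, j)


def identity(a: str, b: str) -> float:
    """% identity: assumed to be stored above the diagonal."""
    lo, hi = _indices(a, b)
    return MATRIX[lo][hi]


def similarity(a: str, b: str) -> float:
    """% similarity: assumed to be stored below the diagonal."""
    lo, hi = _indices(a, b)
    return MATRIX[hi][lo]


if __name__ == "__main__":
    print(identity("NodC", "PaxC"), similarity("NodC", "PaxC"))
```

The same lookup applies to Tables 3 to 6 once their enzyme lists and values are substituted.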

TABLE 3 Similarity matrix of 3-geranylgeranylindole epoxidases (‘M’ enzymes).
Enzyme | LtmM | NodM | AtmM | PenM | JanM | PaxM
LtmM | 100 | 37.3 | 38.2 | 36.8 | 36.9 | 37.9
NodM | 57.2 | 100 | 48.3 | 51.2 | 52.9 | 48.6
AtmM | 56.3 | 63.4 | 100 | 48.3 | 47.9 | 48.9
PenM | 54.8 | 66 | 62.8 | 100 | 61.6 | 60.6
JanM | 54.7 | 67 | 61.6 | 75.1 | 100 | 66.7
PaxM | 56.6 | 65.8 | 64.4 | 74 | 79.9 | 100

Numbers in italics represent % identity scores and numbers in bold represent % similarity scores for amino acid residues.

TABLE 4 Similarity matrix of indole diterpene cyclases (‘B’ enzymes).
Enzyme | PaxB | NodB | LtmB | AtmB | JanB | PenB
PaxB | 100 | 63 | 49.6 | 62.1 | 77 | 72.4
NodB | 78.2 | 100 | 48.8 | 64.2 | 65.4 | 67.9
LtmB | 65.6 | 63.5 | 100 | 48.8 | 51.6 | 52
AtmB | 77 | 78.2 | 65.2 | 100 | 67.5 | 70.8
JanB | 87.2 | 77.8 | 66.4 | 79.8 | 100 | 78.2
PenB | 87.7 | 80.2 | 66.4 | 82.3 | 86.4 | 100

Numbers in italics represent % identity scores and numbers in bold represent % similarity scores for amino acid residues.

TABLE 5 Similarity matrix of indole diterpene prenyl transferases (‘D’ and ‘E’ enzymes compared to NodD1 and NodD2).
Enzyme | NodD2 | PaxD | NodD1 | JanD | AtmD | LtmE | PenD | PenE
NodD2 | 100 | 42.3 | 44.7 | 45 | 31.6 | 11.3 | 32.6 | 23.3
PaxD | 61.9 | 100 | 44.9 | 65.8 | 31.3 | 11.4 | 31.4 | 24.3
NodD1 | 60.7 | 63.6 | 100 | 49.2 | 29.2 | 11.2 | 29.7 | 22.7
JanD | 63.1 | 80.6 | 65.6 | 100 | 30.5 | 11.7 | 31.9 | 25.2
AtmD | 49.6 | 49.4 | 47.6 | 50.2 | 100 | 11.2 | 28.6 | 25.2
LtmE | 20 | 20.7 | 21 | 21.6 | 19.5 | 100 | 10.8 | 11.6
PenD | 54.4 | 53.5 | 50.7 | 51.9 | 48.1 | 20.9 | 100 | 24.3
PenE | 41.1 | 40.5 | 40.2 | 41.3 | 37.4 | 21.7 | 41.3 | 100

Numbers in italics represent % identity scores and bold numbers represent % similarity scores for amino acid residues.

TABLE 6 Similarity matrix of indole diterpene FAD dependent oxidative cyclases (‘O’ enzymes).
Enzyme | PenO | JanO | NodO | PaxO
PenO | 100 | 42.9 | 44.9 | 40.3
JanO | 59.3 | 100 | 50.7 | 71.9
NodO | 61.9 | 69.2 | 100 | 48.7
PaxO | 56.9 | 84 | 67 | 100

Numbers in italics represent % identity scores and numbers in bold represent % similarity scores for amino acid residues.

TABLE 7 Table of fungal species used in this study.
Hypoxylon pulicicidum (Nodulisporium sp.) strain | Description | Indole diterpene phenotype: Nodulisporic acid A | Source^(reference(s))
ATCC® 74245™ | Wild type | + | ATCC²⁵

Penicillium paxilli strain | Description | Indole diterpene phenotype: Paspaline | Indole diterpene phenotype: Paxilline | Source^(reference(s))
PN2013 (ATCC® 26601™) | Wild type | + | + | Barry Scott, Massey University²⁴
PN2250 (CY2) | PN2013/Deletion of entire PAX locus (ΔPAX); Hyg^(R) | − | − | Barry Scott, Massey University¹⁵
PN2257 | PN2013/ΔpaxM::P_(glcA)-hph-T_(trpC); Hyg^(R) | − | − | Barry Scott, Massey University¹⁵
PN2290 | PN2013/ΔpaxC::P_(trpC)-hph; Hyg^(R) | − | − | Barry Scott, Massey University²⁸

TABLE 8 PCR primers for amplification of transcription unit modules (TUMs).
TU | TUM | Primer name | Primer sequence (5′ to 3′)
Hypoxylon pulicicidum primers
nodW | nodW_(CDS) | P4502 frag 1 F | cgatgtacgtctcaCTCG actttagctattttaggcatcagttgcc
nodW | nodW_(CDS) | P4502 frag 1 R | actgctcgtctcaACTCccgctgcgagccgct
nodW | nodW_(CDS) | P4502 frag 2 F | acgtaccgtctccGAGTccggtcctggtggagtgatc
nodW | nodW_(CDS) | P4502 frag 2 R | gacctttcgtctctGTCTca ctaagttatgcccagatatttccag
nodM | nodM_(CDS) | nodM frag1 F | cgatgtacgtctcaCTCG AATGtctacccctgagttcaagg
nodM | nodM_(CDS) | nodM frag1 R | cagtcacgtctcaACGCctctcaagaacgatgtgggaaattc
nodM | nodM_(CDS) | nodM frag2 F | gtgcatcgtctcaGCGTagtgtaatcgcaccagag
nodM | nodM_(CDS) | nodM frag2 R | gacctttcgtctctGTCTca ctatgaagcgatgtctctaatatggagtaac
nodB | nodB_(CDS) | nodB F | cgatgtacgtctcaCTCG AATGgatggattcgatcgttccaatg
nodB | nodB_(CDS) | nodB R | gacctttcgtctctGTCTca ttattgagccttccgcgcattg
nodC | nodC_(CDS) | nodC frag1 F | cgatgtacgtctcaCTCGAATGtccttaggtttacagtgcttgg
nodC | nodC_(CDS) | nodC frag1 R | cattgacgtctcgGTCAcgtcgccaaaccagcga
nodC | nodC_(CDS) | nodC frag2 F | gtcacgcgtctctTGACggcctcactagctttcc
nodC | nodC_(CDS) | nodC frag2 R | gacctttcgtctctGTCTca tcaatgcgtaagatcgagtttctcctttct
Penicillium paxilli primers
paxG | paxG_(ProUTR) | PpaxG F | cgatgtacgtctcaCTCG GGAGattcacgacctgtgactagtcaa
paxG | paxG_(ProUTR) | PpaxG R | gacctttcgtctctGTCTca ggcgtcgaacttgatgaagttttc
paxG | paxG_(CDS) | paxG frag1 F | cgatgtacgtctcaCTCGAATGtcctacatccttgcagaag
paxG | paxG_(CDS) | paxG frag1 R | cttctacgtctcgTACTgttctaatcgtgcttggtg
paxG | paxG_(CDS) | paxG frag2 F | gcacgacgtctccAGTAcaggtgctagaagatgacgttgac
paxG | paxG_(CDS) | paxG frag2 R | aggcgccgtctccACCAatctctttcaatcttgcttgttgga
paxG | paxG_(CDS) | paxG frag3 F | gattgacgtctctTGGTgacccccgcgcctt
paxG | paxG_(CDS) | paxG frag3 R | gtcgaccgtctctTTCCctagtatattggaagctccccg
paxG | paxG_(CDS) | paxG frag4 F | tccaatcgtctcgGGAAaccctaagtcgacttagtgcg
paxG | paxG_(CDS) | paxG frag4 R | gacctttcgtctctGTCTca ttaaactcttcctttctcattagtaggg
paxG | paxG_(UTRterm) | TpaxG F | cgatgtacgtctcaCTCGGCTTtcaatcgtgctgcatttctctt
paxG | paxG_(UTRterm) | TpaxG R | gacctttcgtctctGTCTca tcactcccgagcaatattgct
paxC | paxC_(ProUTR) | PpaxC F2 | cgatgtacgtctcaCTCGGGAGacaacaaaaagatcagccaatgg
paxC | paxC_(ProUTR) | PpaxC R2 | gacctttcgtctctGTCTca aaaatgggacctacaccctgaa
paxC | paxC_(CDS) | paxC frag1 F | cgatgtacgtctcaCTCGAATGggcgtagcaggga
paxC | paxC_(CDS) | paxC frag1 R | cattgacgtctccACGGcgccagacaaggga
paxC | paxC_(CDS) | paxC frag2 F | cccttgcgtctcgCCGTgacggagtcaatgggttc
paxC | paxC_(CDS) | paxC frag2 R | gacctttcgtctctGTCTca tcatgccttcaggtcaagcttc
paxC | paxC_(UTRterm) | TpaxC F | cgatgtacgtctcaCTCGGCTTttggccttgtgaaatatgggactac
paxC | paxC_(UTRterm) | TpaxC R | gacctttcgtctctGTCTca atctctgtcatgtcggatatcagat
paxM | paxM_(ProUTR) | PpaxM F | cgatgtacgtctcaCTCGGGAGgttgttggcatgggagtaggat
paxM | paxM_(ProUTR) | PpaxM R | gacctttcgtctctGTCTca ggtttctgaatcttaaagatacatgaaaag
paxM | paxM_(CDS) | paxM frag1 F | cgatgtacgtctcaCTCG AATGgaaaaggccgagtttcaag
paxM | paxM_(CDS) | paxM frag1 R | tgacaacgtctcgTCCAtcgaataaagcgttgacttgc
paxM | paxM_(CDS) | paxM frag2 F | acgcttcgtctcaTGGActcactattgtcacaatccatggaaaag
paxM | paxM_(CDS) | paxM frag2 R | gacctttcgtctctGTCTca ttaaacttgaagaaaataaaacttcagggcac
paxM | paxM_(UTRterm) | TpaxM frag1 F | cgatgtacgtctcaCTCG GCTTaccattggagcaatttttggttttc
paxM | paxM_(UTRterm) | TpaxM frag1 R | gttcgccgtctcgACTCgattgcttgtgggtct
paxM | paxM_(UTRterm) | TpaxM frag2 F | acaagccgtctccGAGTccagccagcgaacttg
paxM | paxM_(UTRterm) | TpaxM frag2 R | gacctttcgtctctGTCTca ttttggcttacttcagtttaactgttttg
paxB | paxB_(ProUTR) | PpaxB F | cgatgtacgtctcaCTCG GGAGaaggctgtgttggagagaatc
paxB | paxB_(ProUTR) | PpaxB R | gacctttcgtctctGTCTca agtttctaaggttgacgtgggaaaaag
paxB | paxB_(CDS) | paxB F | cgatgtacgtctcaCTCGAATGgacggttttgatgtttcccaa
paxB | paxB_(CDS) | paxB R | gacctttcgtctctGTCTca tcaatttgcttttttcggcccgcttatgc
paxB | paxB_(UTRterm) | TpaxB F | cgatgtacgtctcaCTCGGCTTtcggcagttgagggtgaaac
paxB | paxB_(UTRterm) | TpaxB R | gacctttcgtctctGTCTca ggttaacaatgaggaacgatgaacag
Additional primers
trpC | trpC_(ProUTR) | PtrpC frag1 F | cgatgtacgtctcaCTCG GGAGgaattcatgccagttgttcccag
trpC | trpC_(ProUTR) | PtrpC frag1 R | cgatgtacgtctca ggccgactcgctg
trpC | trpC_(ProUTR) | PtrpC frag2 F | cacctttcgtctcc agacgtgaagcaggacgg
trpC | trpC_(ProUTR) | PtrpC frag2 R | cgatgtcgtctcgCAGAccattgcacaagcctc
trpC | trpC_(ProUTR) | PtrpC frag3 F | gacctttcgtctcgTCTGcgcatggatcgctgc
trpC | trpC_(ProUTR) | PtrpC frag3 R | gacctttcgtctctGTCTca atcgatgcttgggtagaataggtaag
trpC | trpC_(UTRterm) | T trpC frag1 F | cgatgtacgtctcaCTCG GCTTgatccacttaacgttactgaaatcatcaaac
trpC | trpC_(UTRterm) | T trpC frag1 R | gacctttcgtctctCTGCttgatctcgtctgccga
trpC | trpC_(UTRterm) | T trpC frag2 F | cgatgtacgtctcaGCAGatcaacggtcgtcaaga
trpC | trpC_(UTRterm) | T trpC frag2 R | gacctttcgtctctGTCTca tctagaaagaaggattacctctaaacaagtgt
nptII | nptII_(CDS) | ntpII F | cgatgtacgtctcaCTCG AATGattgaacaagatggattgcacg
nptII | nptII_(CDS) | ntpII R | gacctttcgtctctGTCTca ctcagaagaactcgtcaagaaggc

The forward and reverse PCR primers used for amplification of TUMs (i.e. promoters (ProUTR), coding sequences (CDSs), and terminators (UTRterm)) are listed. Primers used to amplify TUM fragments for domestication purposes (i.e. removal of internal sites for AarI, BsaI or BsmBI) are underlined (e.g., P4502 frag 1 R). The template for amplification of nod CDSs was genomic DNA from Hypoxylon pulicicidum strain ATCC® 74245™.²⁵ The template for amplification of pax gene TUMs was genomic DNA from Penicillium paxilli strain ATCC® 26601™ (PN2013).²⁴ The PCR products used to produce the trpC ProUTR module, nptII CDS module (conferring resistance to geneticin), and trpC_(UTRterm) module were all amplified from plasmid pII99.⁴¹ The BsmBI recognition sites are shown in bold lower case text (cgtctc), with the overhangs generated following BsmBI cleavage shown by the upper case italics text. The 5′ (prefix) and 3′ (suffix) nucleotide bases, which flank each TUM and form the basis of the address system for each of the MIDAS modules, are shown in bold upper case text, and bold upper case italic text respectively.
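Domestication, as described above, removes internal AarI, BsaI and BsmBI sites from a module before cloning. A simple way to decide whether a sequence needs domestication is to scan both strands for the three recognition sequences. The following is a minimal sketch of such a scan; the function names are illustrative, and the recognition sequences used (AarI CACCTGC, BsaI GGTCTC, BsmBI CGTCTC) are the standard ones for these enzymes.

```python
# Recognition sequences (5' to 3') of the Type IIS enzymes used in MIDAS.
SITES = {"AarI": "CACCTGC", "BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def internal_sites(seq: str) -> list:
    """Return (enzyme, strand, 0-based start) for every site found in `seq`."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        for strand, motif in (("+", site), ("-", revcomp(site))):
            start = seq.find(motif)
            while start != -1:
                hits.append((enzyme, strand, start))
                start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # The common 5' stub of the forward primers carries one BsmBI site.
    print(internal_sites("cgatgtacgtctcaCTCG"))
```

A module with an empty result can be amplified with a single primer pair; each hit would call for an additional fragment boundary, as in the "frag" primers listed in Table 8.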

TABLE 9 MIDAS Level-1 plasmid library: Assembly of TUMs in pML1.
Flanking addresses (5′ to 3′): [GGAG] ProUTR modules [AATG] CDS modules [GCTT] UTRterm modules [CGCT]
ProUTR plasmid name | ProUTR description | CDS plasmid name | CDS description | UTRterm plasmid name | UTRterm description
pSK1 | paxG_(ProUTR) | pKV45 | nodW_(CDS) | pSK3 | paxG_(UTRterm)
pKV28 | paxC_(ProUTR) | pKV59 | nodM_(CDS) | pSK12 | paxC_(UTRterm)
pSK4 | paxM_(ProUTR) | pSK18 | | pSK6 | paxM_(UTRterm)
pSK7 | paxB_(ProUTR) | pSK19 | nodB_(CDS) | pSK9 | paxB_(UTRterm)
pSK17 | trpC_(ProUTR) | pSK20 | nodC_(CDS) | pSK15 | trpC_(UTRterm)
 | | pSK2 | paxG_(CDS) | |
 | | pSK11 | paxC_(CDS) | |
 | | pSK5 | paxM_(CDS) | |
 | | pSK16 | nptII_(CDS) | |

This table represents the MIDAS level-1 TUMs that we used to assemble MIDAS level-2 TUs (Table 10). The 4 base prefixes and suffixes (5′ to 3′) that flank each TUM are shown at the top of the table to highlight the sequences used to bind the TUMs together to make MIDAS level-2 TUs. These 4 base flanking regions are depicted in the primer table (Table 8) in bold upper case text (forward addresses) and bold upper case italic text (reverse addresses).

TABLE 10 MIDAS Level-2 plasmid library: Assembly of TUs in pML2 destination vectors
TU | Level-1 entry clone: ProUTR | Level-1 entry clone: CDS | Level-1 entry clone: UTRterm | pML2 destination vector | Level-2 entry clone name | Level-2 entry clone description
nodW | pSK17 | pKV45 | pSK15 | pML2(+)WR | pKV52 | (T_(trpC)-nodW-P_(trpC)):pML2(+)WR
 | | | | pML2(+)BR | pSK67 | (T_(trpC)-nodW-P_(trpC)):pML2(+)BR
nodM | pSK4 | pKV59 | pSK6 | pML2(+)BF | pKV57 | (P_(trpC)-nodM-T_(trpC)):pML2(+)BF
 | | pSK18 | | pML2(+)WF | pSK28 | (P_(trpC)-nodM-T_(trpC)):pML2(+)WF
nodB | pSK7 | pSK19 | pSK9 | pML2(+)BR | pSK29 | (T_(trpC)-nodB-P_(trpC)):pML2(+)BR
nodC | pSK17 | pSK20 | pSK15 | pML2(+)BF | pKV26 | (P_(trpC)-nodC-T_(trpC)):pML2(+)BF
 | pKV28 | | pSK12 | pML2(+)WF | pSK60 | (P_(trpC)-nodC-T_(trpC)):pML2(+)WF
paxG | pSK1 | pSK2 | pSK3 | pML2(+)BR | pSK21 | (T_(paxG)-paxG-P_(paxG)):pML2(+)BR
paxC | pKV28 | pSK11 | pSK12 | pML2(+)WF | pSK59 | (P_(paxC)-paxC-T_(paxC)):pML2(+)WF
paxM | pSK4 | pSK5 | pSK6 | pML2(+)WR | pSK22 | (T_(paxM)-paxM-P_(paxM)):pML2(+)WR
nptII | pSK17 | pSK16 | pSK15 | pML2(+)WF | pSK26 | (P_(trpC)-nptII-T_(trpC)):pML2(+)WF

This table represents the construction of the MIDAS level-2 TUs that were used to assemble MIDAS level-3 multi-gene plasmids for heterologous expression studies. The names of the Level-2 entry plasmids produced are shown in bold. TUs are described by the CDS they contain and TU orientation, determined by the pML2 destination vector, is shown in the Level-2 description by an arrowhead that differs between the forward (F) and reverse (R) destination vectors.

TABLE 11 MIDAS Level-3 plasmid library: Multi-gene assemblies in pML3
Step | Level-2 entry clone name | Level-2 entry clone description | Destination vector | Golden Gate reaction | Product plasmid name | Level-3 plasmid description | Plasmid size (kb)
1 | pSK26 | (P_(trpC)-nptII-T_(trpC)):pML2(+)WF | pML3 | AarI | pKV22 | pML3:nptII | 5.6
2 | pKV26 | (P_(trpC)-nodC-T_(trpC)):pML2(+)BF | pKV22 | BsmBI | pKV27 | pML3:nptII:nodC | 9.0
2 | pKV57 | (P_(trpC)-nodM-T_(trpC)):pML2(+)BF | pKV22 | BsmBI | pKV63 | pML3:nptII:nodM | 9.4
3 | pKV52 | (T_(trpC)-nodW-P_(trpC)):pML2(+)WR | pKV63 | AarI | pKV64 | pML3:nptII:nodM:nodW | 13.0
1 | pSK26 | (P_(trpC)-nptII-T_(trpC)):pML2(+)WF | pML3 | AarI | pSK33 | pML3:nptII | 5.6
2 | pSK21 | (T_(paxG)-paxG-P_(paxG)):pML2(+)BR | pSK33 | BsmBI | pSK34 | pML3:nptII:paxG | 8.2
3 | pSK22 | (T_(paxM)-paxM-P_(paxM)):pML2(+)WR | pSK34 | AarI | pSK36 | pML3:nptII:paxG:paxM | 11.5
4 | pSK29 | (T_(trpC)-nodB-P_(trpC)):pML2(+)BR | pSK36 | BsmBI | pKV73 | pML3:nptII:paxG:paxM:nodB | 14.1
5 | pSK59 | (P_(paxC)-paxC-T_(paxC)):pML2(+)WF | pSK73 | AarI | pKV74 | pML3:nptII:paxG:paxM:nodB:paxC | 16.3
3 | pSK28 | (P_(trpC)-nodM-T_(trpC)):pML2(+)WF | pSK34 | AarI | pSK35 | pML3:nptII:paxG:nodM | 11.5
4 | pSK29 | (T_(trpC)-nodB-P_(trpC)):pML2(+)BR | pSK35 | BsmBI | pSK38 | pML3:nptII:paxG:nodM:nodB | 14.1
5 | pSK60 | (P_(trpC)-nodC-T_(trpC)):pML2(+)WF | pSK38 | AarI | pSK66 | pML3:nptII:paxG:nodM:nodB:nodC | 16.3
6 | pSK67 | (T_(trpC)-nodW-P_(trpC)):pML2(+)BR | pSK66 | BsmBI | pSK68 | pML3:nptII:paxG:nodM:nodB:nodC:nodW | 20.5

The table shows the Level-2 entry clone and Level-3 destination vectors used to construct the multi-gene plasmids. The names of the plasmids produced during each cycle of Level-3 assembly are shown in bold. The number of level 3 assembly reactions used to create the level-3 plasmid is indicated by number in the step column. TUs are annotated with the name of the CDS they contain. TU orientation is shown by the arrowhead.

TABLE 12 Generalised primer design for amplification of ProUTR, CDS and UTRterm modules to be cloned into pML1.
TUM | Primer | Primer sequence (5′ to 3′)
[GGAG]-ProUTR-[ ] | Forward | 5′-cgatgtacgtctcaCTCGGGAG(+ 18-25 bases specific for the 5′ end of the promoter)-3′
[GGAG]-ProUTR-[ ] | Reverse | 5′-gacctttcgtctctGTCTca (+ 18-25 bases specific for the 3′ end of the 5′UTR)-3′
Note: The CAT sequence (reverse-complement = ATG) underlined within the module-specific nucleotides, specifies the translation initiation codon for the CDS of interest, while the final T (not underlined) represents the base immediately upstream of the initiation codon.
[A ATG]-CDS-[ ] | Forward | 5′-cgatgtacgtctcaCTCG A ATG(+ 18-25 bases specific for the 5′ end of the CDS, beginning at the 2^(nd) codon)-3′
Note: The ATG sequence (underlined within the A ATG module-specific nucleotides) specifies the translation initiation codon for the CDS of interest, while the initial A (not underlined) represents the base immediately upstream of the initiation codon.
[A ATG]-CDS-[ ] | Reverse | 5′-gacctttcgtctctGTCTca *(+ 18-25 bases specific for the 3′ end of the CDS)-3′
Note: Remember to include a stop codon (*) at the end of the CDS.
[GCTT]-UTRterm-[ ] | Forward | 5′-cgatgtacgtctcaCTCG GCTT(+ 18-25 bases specific for the 5′ end of the 3′UTR)-3′
[GCTT]-UTRterm-[ ] | Reverse | 5′-gacctttcgtctctGTCTca (+ 18-25 bases specific for the 3′ end of the terminator)-3′

Generalised features of forward and reverse PCR primers used for amplification of TUMs are listed. The BsmBI recognition sites are shown in lower case bold (cgtctc), with the overhangs generated following BsmBI cleavage shown by upper case italics (e.g., CTCG). The 5′ and 3′ nucleotide-specific bases, which flank each TUM and form the basis of the address system for each of the MIDAS modules, are shown in upper case bold (e.g., GGAG) and upper case bold italics (e.g., CATT), respectively.
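The forward-primer rule in Table 12 can be expressed as a short function: a constant 5′ stub containing the BsmBI site and CTCG overhang, then the module-specific 4 base prefix, then 18-25 template-specific bases. The sketch below covers forward primers only, because the 3′ address bases of the reverse primers are not legible in the table as reproduced here. The function name and the default of 22 specific bases are illustrative choices.

```python
# Constant 5' stub of every forward primer: filler, BsmBI site (cgtctc), overhang CTCG.
FORWARD_STUB = "cgatgtacgtctcaCTCG"

# Module-specific 4 base prefixes from Table 12.
PREFIX = {"ProUTR": "GGAG", "CDS": "AATG", "UTRterm": "GCTT"}


def forward_primer(module_type: str, template: str, n_specific: int = 22) -> str:
    """Build a MIDAS-style forward primer for the 5' end of `template`."""
    if module_type not in PREFIX:
        raise ValueError(f"unknown module type: {module_type}")
    if not 18 <= n_specific <= 25:
        raise ValueError("Table 12 specifies 18-25 template-specific bases")
    template = template.lower()
    if module_type == "CDS":
        if not template.startswith("atg"):
            raise ValueError("a CDS template must begin with ATG")
        # The AATG prefix already supplies the start codon, so the
        # specific bases begin at the second codon.
        template = template[3:]
    if len(template) < n_specific:
        raise ValueError("template is shorter than the requested specific region")
    return FORWARD_STUB + PREFIX[module_type] + template[:n_specific]


if __name__ == "__main__":
    # Start of a CDS as implied by the nodB F primer in Table 8;
    # the trailing "ccc" is placeholder sequence for illustration only.
    print(forward_primer("CDS", "ATGGATGGATTCGATCGTTCCAATGccc"))
```

With 22 specific bases, the example reproduces the nodB F primer listed in Table 8 (ignoring the space shown there between the stub and the prefix).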

TABLE 13 Level-3 multigene assemblies are constructed by alternating Golden Gate cloning reactions using TUs assembled in “White” and “Blue” pML2 vectors.
Step | Level-2 entry clone | Destination plasmid | Golden Gate reaction | Product plasmid | Screen
1 | TU1 in a White pML2 vector | pML3 | AarI-mediated | pML3:TU1 | White colonies
2 | TU2 in a Blue pML2 vector | pML3:TU1 | BsmBI-mediated | pML3:TU1:TU2 | Blue colonies
3 | TU3 in a White pML2 vector | pML3:TU1:TU2 | AarI-mediated | pML3:TU1:TU2:TU3 | White colonies
4 | TU4 in a Blue pML2 vector | pML3:TU1:TU2:TU3 | BsmBI-mediated | pML3:TU1:TU2:TU3:TU4 | Blue colonies

The table shows the cloning steps used to produce a hypothetical multigene construct containing four TUs, with each row depicting the input plasmids (Level-2 entry clone and destination plasmid), the type of Golden Gate reaction used for assembly, the product plasmid and the type of colonies screened.
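The alternation shown in Table 13 follows a fixed pattern: odd-numbered steps use a TU in a “White” pML2 vector with AarI and a white-colony screen, and even-numbered steps use a TU in a “Blue” pML2 vector with BsmBI and a blue-colony screen. The sketch below generates that plan for any ordered list of TUs; the function name and the dictionary keys are illustrative.

```python
def plan_level3(tus: list) -> list:
    """Return one dict per Level-3 assembly step for the ordered list of TUs."""
    steps = []
    destination = "pML3"
    for number, tu in enumerate(tus, start=1):
        odd = number % 2 == 1
        product = f"{destination}:{tu}"
        steps.append({
            "step": number,
            "entry_clone": f"{tu} in a {'White' if odd else 'Blue'} pML2 vector",
            "destination": destination,
            "reaction": "AarI" if odd else "BsmBI",
            "product": product,
            "screen": "White colonies" if odd else "Blue colonies",
        })
        # The product of each step is the destination plasmid of the next.
        destination = product
    return steps


if __name__ == "__main__":
    for step in plan_level3(["TU1", "TU2", "TU3", "TU4"]):
        print(step)
```

The plan also indicates which pML2 colour each TU must be built in at Level-2, since the colour is fixed by the TU's position in the final construct.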

TABLE 14 Multistep acetonitrile gradient used for LC-MS analysis of fungal extracts.
Time (minutes) | % (v/v) of acetonitrile + 0.01% (v/v) formic acid
0 | 50
1 | 50
15 | 70
20 | 95
25 | 95
28 | 50
38 | 50
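The gradient in Table 14 is defined only at its breakpoints. To estimate the solvent composition at which a compound elutes, the table can be interpolated as in the sketch below. It assumes linear ramps between consecutive time points, which is the usual behaviour of a multistep gradient but is not stated in the specification; the function name is illustrative.

```python
# (time in minutes, % acetonitrile) breakpoints from Table 14.
GRADIENT = [(0, 50), (1, 50), (15, 70), (20, 95), (25, 95), (28, 50), (38, 50)]


def percent_acetonitrile(t_min: float) -> float:
    """Return the % (v/v) acetonitrile at time `t_min`, assuming linear ramps."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-38 minute method")
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole method")


if __name__ == "__main__":
    for t in (0, 8, 17.5, 26.5):
        print(t, percent_acetonitrile(t))
```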

INDUSTRIAL APPLICATION

The invention has industrial application in the production of indole diterpene compounds, particularly NAs.

REFERENCES

-   (1) Ogata, M.; Ueda, J.; Hoshi, M.; Hashimoto, J.; Nakashima, T.; Anzai, K.; Takagi, M.; Shin-ya, K. J. Antibiot. (Tokyo) 2007, 60 (10), 645-648.
-   (2) Nakazawa, J.; Yajima, J.; Usui, T.; Ueki, M.; Takatsuki, A.; Imoto, M.; Toyoshima, Y. Y.; Osada, H. Chem. Biol. 2003, 10 (2), 131-137.
-   (3) Sallam, A. A.; Ayoub, N. M.; Foudah, A. I.; Gissendanner, C. R.; Meyer, S. A.; El Sayed, K. A. Eur. J. Med. Chem. 2013, 70, 594-606.
-   (4) Fan, Y.; Wang, Y.; Liu, P.; Fu, P.; Zhu, T.; Wang, W.; Zhu, W. J. Nat. Prod. 2013, 76 (7), 1328-1336.
-   (5) Meinke, P. T.; Smith, M. M.; Shoop, W. L. Curr. Top. Med. Chem. 2002, 2 (7), 655-674.
-   (6) Knaus, H.-G.; McManus, O. B.; Lee, S. H.; Schmalhofer, W. A.; Garcia-Calvo, M.; Helms, L. M. H.; Sanchez, M.; Giangiacomo, K.; Reuben, J. P. Biochemistry (Mosc.) 1994, 33 (19), 5819-5828.
-   (7) Bills, G. F.; González-Menendez, V.; Martin, J.; Platas, G.; Fournier, J.; Peroh, D.; Stadler, M. PLOS ONE 2012, 7 (10), e46687.
-   (8) Shoop, W. L.; Gregory, L. M.; Zakson-Aiken, M.; Michael, B. F.; Haines, H. W.; Ondeyka, J. G.; Meinke, P. T.; Schmatz, D. M. J. Parasitol. 2001, 87 (2), 419-423.
-   (9) Byrne, K. M.; Smith, S. K.; Ondeyka, J. G. J. Am. Chem. Soc. 2002, 124 (24), 7055-7060.
-   (10) Smith, A. B.; Davulcu, A. H.; Cho, Y. S.; Ohmoto, K.; Kürti, L.; Ishiyama, H. J. Org. Chem. 2007, 72 (13), 4596-4610.
-   (11) Zou, Y.; Melvin, J. E.; Gonzales, S. S.; Spafford, M. J.; Smith, A. B. J. Am. Chem. Soc. 2015, 137 (22), 7095-7098.
-   (12) Melvin, J. E. Diss. Available ProQuest 2014, 1-273.
-   (13) Young, C.; McMillan, L.; Telfer, E.; Scott, B. Mol. Microbiol. 2001, 39 (3), 754-764.
-   (14) Saikia, S.; Parker, E. J.; Koulman, A.; Scott, B. J. Biol. Chem. 2007, 282 (23), 16829-16837.
-   (15) Scott, B.; Young, C. A.; Saikia, S.; McMillan, L. K.; Monahan, B. J.; Koulman, A.; Astin, J.; Eaton, C. J.; Bryant, A.; Wrenn, R. E.; Finch, S. C.; Tapper, B. A.; Parker, E. J.; Jameson, G. B. Toxins 2013, 5 (8), 1422-1446.
-   (16) Tagami, K.; Liu, C.; Minami, A.; Noike, M.; Isaka, T.; Fueki, S.; Shichijo, Y.; Toshima, H.; Gomi, K.; Dairi, T.; Oikawa, H. J. Am. Chem. Soc. 2013, 135 (4), 1260-1263.
-   (17) Motoyama, T.; Hayashi, T.; Hirota, H.; Ueki, M.; Osada, H. Chem. Biol. 2012, 19 (12), 1611-1619.
-   (18) Saikia, S.; Takemoto, D.; Tapper, B. A.; Lane, G. A.; Fraser, K.; Scott, B. FEBS Lett. 2012, 586 (16), 2563-2569.
-   (19) Tagami, K.; Minami, A.; Fujii, R.; Liu, C.; Tanaka, M.; Gomi, K.; Dairi, T.; Oikawa, H. ChemBioChem 2014, 15 (14), 2076-2080.
-   (20) Liu, C.; Tagami, K.; Minami, A.; Matsumoto, T.; Frisvad, J. C.; Suzuki, H.; Ishikawa, J.; Gomi, K.; Oikawa, H. Angew. Chem. Int. Ed. 2015, 54 (19), 5748-5752.
-   (21) Tang, M.-C.; Lin, H.-C.; Li, D.; Zou, Y.; Li, J.; Xu, W.; Cacho, R. A.; Hillenmeyer, M. E.; Garg, N. K.; Tang, Y. J. Am. Chem. Soc. 2015, 137 (43), 13724-13727.
-   (22) Liu, C.; Minami, A.; Dairi, T.; Gomi, K.; Scott, B.; Oikawa, H. Org. Lett. 2016, 18 (19), 5026-5029.
-   (23) Saikia, S.; Scott, B. Mol. Genet. Genomics 2009, 282 (3), 257-271.
-   (24) Itoh, Y.; Johnson, R.; Scott, B. Curr. Genet. 1994, 25 (6), 508-513.
-   (25) Dombrowski, A. W.; Endris, R. G.; Helms, G. L.; Hensens, O. D.; Ondeyka, J. G.; Ostlind, D. A.; Polishook, J. D.; Zink, D. L. U.S. Pat. No. 5,399,582-Antiparasitic agents. 5399582, Mar. 21, 1995.
-   (26) Byrd, A. D.; Schardl, C. L.; Songlin, P. J.; Mogen, K. L.; Siegel, M. R. Curr. Genet. 1990, 18 (4), 347-354.
-   (27) Engler, C.; Kandzia, R.; Marillonnet, S. PLOS ONE 2008, 3 (11), e3647.
-   (28) Sarrion-Perdigones, A.; Falconi, E. E.; Zandalinas, S. I.; Juarez, P.; Fernandez-del-Carmen, A.; Granell, A.; Orzaez, D. PLOS ONE 2011, 6 (7), e21622.
-   (29) Sarrion-Perdigones, A.; Vazquez-Vilar, M.; Palací, J.; Castelijns, B.; Forment, J.; Ziarsolo, P.; Blanca, J.; Granell, A.; Orzaez, D. Plant Physiol. 2013, 162 (3), 1618-1631.
-   (30) De Paoli, H. C.; Tuskan, G. A.; Yang, X. Sci. Rep. 2016, 6.
-   (31) Weber, E.; Engler, C.; Gruetzner, R.; Werner, S.; Marillonnet, S. PLOS ONE 2011, 6 (2), e16765.
-   (32) Binder, A.; Lambert, J.; Morbitzer, R.; Popp, C.; Ott, T.; Lahaye, T.; Parniske, M. PLOS ONE 2014, 9 (2), e88218.
-   (33) Patron, N. J.; Orzaez, D.; Marillonnet, S.; Warzecha, H.; Matthewman, C.; Youles, M.; Raitskin, O.; Leveau, A.; Farre, G.; Rogers, C.; Smith, A.; Hibberd, J.; Webb, A. A. R.; Locke, J.; Schornack, S.; Ajioka, J.; Baulcombe, D. C.; Zipfel, C.; Kamoun, S.; Jones, J. D. G.; Kuhn, H.; Robatzek, S.; Van Esse, H. P.; Sanders, D.; Oldroyd, G.; Martin, C.; Field, R.; O'Connor, S.; Fox, S.; Wulff, B.; Miller, B.; Breakspear, A.; Radhakrishnan, G.; Delaux, P.-M.; Logue, D.; Granell, A.; Tissier, A.; Shih, P.; Brutnell, T. P.; Quick, W. P.; Rischer, H.; Fraser, P. D.; Aharoni, A.; Raines, C.; South, P. F.; Ane, J.-M.; Hamberger, B. R.; Langdale, J.; Stougaard, J.; Bouwmeester, H.; Udvardi, M.; Murray, J. A. H.; Ntoukakis, V.; Schäfer, P.; Denby, K.; Edwards, K. J.; Osbourn, A.; Haseloff, J. New Phytol. 2015, 208 (1), 13-19.
-   (34) Beck, R.; Burtscher, H. Nucleic Acids Res. 1994, 22 (5), 886-887.
-   (35) Agmon, N.; Mitchell, L. A.; Cai, Y.; Ikushima, S.; Chuang, J.; Zheng, A.; Choi, W.-J.; Martin, J. A.; Caravelli, K.; Stracquadanio, G.; Boeke, J. D. ACS Synth. Biol. 2015, 4 (7), 853-859.
-   (36) Miyazaki, K. BioTechniques 2015, 58 (2), 86-88.
-   (37) Yelton, M. M.; Hamer, J. E.; Timberlake, W. E. Proc. Natl. Acad. Sci. 1984, 81 (5), 1470-1474.
-   (38) Vollmer, S. J.; Yanofsky, C. Proc. Natl. Acad. Sci. 1986, 83 (13), 4869-4873.
-   (39) Oliver, R. P.; Roberts, I. N.; Harling, R.; Kenyon, L.; Punt, P. J.; Dingemanse, M. A.; van den Hondel, C. A. M. J. J. Curr. Genet. 1987, 12 (3), 231-233.

What we claim is:
1. An isolated polypeptide comprising an amino acid sequence selected from the group consisting of NodW (SEQ ID NO:3), NodR (SEQ ID NO:6), NodJ (SEQ ID NO:21), NodY1 (SEQ ID NO:27), NodY2 (SEQ ID NO:36), NodX (SEQ ID NO:9), NodM (SEQ ID NO:12), NodB (SEQ ID NO:15), NodO (SEQ ID NO:18), NodC (SEQ ID NO:24), NodD2 (SEQ ID NO:30), NodD1 (SEQ ID NO:33), NodZ (SEQ ID NO:39), and NodS (SEQ ID NO:50) or a functional variant or fragment thereof.
2. An isolated polynucleotide encoding a polypeptide of claim 1.
3. An isolated polynucleotide comprising at least 70%, preferably at least 80%, preferably at least 90%, preferably at least 95%, preferably 99% nucleic acid sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1, 2, 4, 5, 19, 20, 25, 26, 34, 35, 7, 8, 10, 11, 13, 14, 16, 17, 22, 23, 28, 29, 31, 32, 37, 38, 48 and 49.
4. A transcription unit (TU) comprising at least one isolated polynucleotide of claim 3.
5. A vector that encodes an isolated polypeptide of claim 1.
6. A vector comprising an isolated polynucleotide of claim 2, or a TU of claim 4.
7. An isolated host cell comprising an isolated polypeptide of claim 1, an isolated polynucleotide of claim 2 or claim 3, a TU of claim 4 and/or a vector of claim 5 or claim 6.
8. A method of making at least one Nodulisporic Acid (NA) comprising heterologously expressing at least one isolated polypeptide of claim 1, isolated polynucleotide of claim 2 or claim 3, a TU of claim 4 and/or a vector of claim 5 or claim 6 in an isolated host cell.
9. An isolated polypeptide or functional fragment or variant thereof from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from 3-geranylgeranylindole (GGI) 2 to NAA 10.
10. An isolated polynucleotide encoding at least one polypeptide or functional variant or fragment thereof from Hypoxylon spp. that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.
11. A method of making at least one Hypoxylon spp. polypeptide or functional variant or fragment thereof comprising heterologously expressing an isolated polynucleotide of claim 2 or claim 3 or a vector of claim 5 or claim 6 in an isolated host cell.
12. A method of making at least one NA comprising heterologously expressing in an isolated host cell, at least one polypeptide that catalyzes a biochemical reaction in the biosynthetic pathway leading from GGI 2 to NAA 10.
13. An isolated host cell that expresses at least one heterologous polypeptide that catalyzes the transformation of a substrate in the biosynthetic pathway leading from GGI 2 to the formation of NAA 10.
14. An isolated host cell that produces by heterologous expression, at least one polypeptide involved in the biosynthetic pathway leading from GGI 2 to NAA 10.
15. A method of producing at least one NA comprising contacting a carbohydrate comprising substrate with a recombinant cell transformed with a nucleic acid that results in an increased level of activity of a polypeptide selected from the group consisting of SEQ ID NO: 3, 6, 21, 27, 36, 9, 12, 15, 18, 24, 30, 33, 39 and 50 or a functional variant or fragment thereof compared to the cell prior to transformation, such that the substrate is metabolized to at least one NA.
16. An isolated strain of Hypoxylon pulicicidum that comprises at least one heterologous nucleic acid sequence encoding an enzyme in a biosynthetic pathway leading to NAA 10.
17. An isolated strain of Hypoxylon pulicicidum that expresses at least two different GGPPS enzymes.
18. An isolated strain of Hypoxylon pulicicidum that comprises a genetic modification that leads to an increased biosynthesis of NAA 10.
19. A method of making NAA 10 comprising expressing at least one heterologous nucleic acid sequence in Hypoxylon pulicicidum, wherein the at least one heterologous nucleic acid sequence encodes an enzyme in a biosynthetic pathway leading to NAA 10.